Dataset

class Dataset()

Abstract base class for map-style datasets.

Subclasses must implement __len__ (total number of samples) and __getitem__ (sample retrieval by integer index). Together these two methods constitute the map-style dataset protocol used by lucid.utils.data.DataLoader.

Notes

Map-style datasets are random-access: any index 0 <= i < len(ds) can be fetched at any time, which is what lets samplers (such as lucid.utils.data.RandomSampler or lucid.utils.data.BatchSampler) drive iteration. If the data source does not support random access (e.g., a streaming log), use IterableDataset instead.
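The random-access property can be illustrated without the library. The sketch below is only an assumption-laden stand-in: Squares here is a plain Python class rather than a Dataset subclass, and iterate_shuffled is a hypothetical helper that mimics what a random sampler does, not the actual RandomSampler implementation.

```python
import random


class Squares:
    """Stand-in map-style dataset; real code would subclass Dataset."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, i):
        # Only 0 <= i < len(self) is part of the protocol.
        if not 0 <= i < self.n:
            raise IndexError(i)
        return i * i


def iterate_shuffled(ds, seed=0):
    """Hypothetical helper: visit every index once, in a random order."""
    indices = list(range(len(ds)))          # needs __len__
    random.Random(seed).shuffle(indices)
    for i in indices:
        yield ds[i]                         # needs random-access __getitem__


samples = list(iterate_shuffled(Squares(5)))
```

Every sample is visited exactly once, in an order chosen by the index stream rather than by the dataset. A streaming source cannot support this pattern, which is why it calls for IterableDataset instead.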

Examples

>>> from lucid.utils.data import Dataset
>>> class Squares(Dataset):
...     def __init__(self, n): self.n = n
...     def __len__(self): return self.n
...     def __getitem__(self, i): return i * i
>>> ds = Squares(5)
>>> ds[3]
9

Methods (3)

__getitem__

__getitem__(index: int) -> Tensor or tuple of Tensor

Retrieve a single sample by integer index.

Parameters

index : int
Position of the sample to return. Implementations are expected to support 0 <= index < len(self); negative indexing is not part of the protocol.

Returns

Tensor or tuple of Tensor

The sample at the given index. Multi-output datasets typically return a tuple such as (input, target).
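As a rough sketch of the (input, target) convention, the class below pairs two equal-length sequences. PairDataset is a hypothetical name, plain Python lists stand in for Tensors, and the class deliberately does not subclass Dataset so that it runs without the library. The explicit range check is one possible way to reject negative indices, which the protocol does not require implementations to support.

```python
class PairDataset:
    """Hypothetical (input, target) dataset; lists stand in for Tensors."""

    def __init__(self, inputs, targets):
        if len(inputs) != len(targets):
            raise ValueError("inputs and targets must have the same length")
        self.inputs = inputs
        self.targets = targets

    def __len__(self):
        return len(self.inputs)

    def __getitem__(self, index):
        # Enforce 0 <= index < len(self); negative indices are rejected
        # here instead of wrapping around as they would on a list.
        if not 0 <= index < len(self):
            raise IndexError(
                f"index {index} out of range for dataset of size {len(self)}"
            )
        return self.inputs[index], self.targets[index]


ds = PairDataset([[0.0, 1.0], [2.0, 3.0]], [0, 1])
x, y = ds[1]  # unpack the (input, target) tuple
```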

__len__

__len__() -> int

Return the number of samples in the dataset.

Returns

int

Total number of samples available via __getitem__.

__add__

__add__(other: Dataset) -> ConcatDataset

Concatenate this dataset with another via the + operator.

Parameters

other : Dataset
Dataset whose samples should follow those of self.

Returns

ConcatDataset

A ConcatDataset wrapping [self, other].
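To make the index arithmetic behind concatenation concrete, here is one way a wrapper over several map-style datasets could route a global index to the right member. ConcatSketch is a hypothetical illustration built on cumulative sizes and bisect; it is not the library's ConcatDataset, whose internals are not described here.

```python
from bisect import bisect_right
from itertools import accumulate


class ConcatSketch:
    """Hypothetical stand-in showing how concatenated indexing can work."""

    def __init__(self, datasets):
        self.datasets = list(datasets)
        # Running totals of the member sizes, e.g. sizes [3, 2] -> [3, 5].
        self.cumulative = list(accumulate(len(d) for d in self.datasets))

    def __len__(self):
        return self.cumulative[-1] if self.cumulative else 0

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        # First member whose running total exceeds the global index.
        which = bisect_right(self.cumulative, index)
        # Subtract the samples that belong to earlier members.
        offset = index - (self.cumulative[which - 1] if which else 0)
        return self.datasets[which][offset]


# Plain lists satisfy __len__ and __getitem__, so they can stand in here.
combined = ConcatSketch([["a", "b", "c"], ["d", "e"]])
```

In this sketch, indices 0 to 2 resolve to the first member and indices 3 to 4 to the second, so the samples of the second dataset follow those of the first.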