Dataset
Dataset()
Abstract base class for map-style datasets.
Subclasses must implement __len__ (total number of samples) and
__getitem__ (sample retrieval by integer index). Together these
two methods constitute the map-style dataset protocol used by
lucid.utils.data.DataLoader.
Notes
Map-style datasets are random-access: any index 0 <= i < len(ds)
can be fetched at any time, which is what lets samplers (such as
lucid.utils.data.RandomSampler or
lucid.utils.data.BatchSampler) drive iteration. If the data
source does not support random access (e.g., a streaming log), use
IterableDataset instead.
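As a rough illustration of why random access matters, the sketch below shuffles a list of indices and fetches samples in batches using only `__len__` and `__getitem__`. It is plain Python and does not use lucid's sampler classes. The stand-in `Squares` class and the helper `iter_batches` are hypothetical names introduced for this example.

```python
import random


class Squares:
    """Stand-in map-style dataset: only __len__ and __getitem__ are required."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, i):
        return i * i


def iter_batches(ds, batch_size, seed=0):
    """Hypothetical helper: visit indices in shuffled order, batch by batch."""
    indices = list(range(len(ds)))
    random.Random(seed).shuffle(indices)  # any visiting order is valid
    for start in range(0, len(indices), batch_size):
        # Each sample is fetched on demand by its integer index.
        yield [ds[i] for i in indices[start:start + batch_size]]


batches = list(iter_batches(Squares(5), batch_size=2))
```

Because every index can be fetched independently, the visiting order is decided entirely outside the dataset. A streaming source cannot offer this.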
Examples
>>> class Squares(Dataset):
...     def __init__(self, n): self.n = n
...     def __len__(self): return self.n
...     def __getitem__(self, i): return i * i
>>> ds = Squares(5)
>>> ds[3]
9
Methods (3)
__getitem__
__getitem__(index: int) → Tensor or tuple of Tensor
Retrieve a single sample by integer index.
Parameters
index : int
    0 <= index < len(self); negative indexing is not part of the protocol.
Returns
Tensor or tuple of Tensor
    The sample at the given index. Multi-output datasets typically return a tuple such as (input, target).
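A minimal sketch of a multi-output dataset that returns an (input, target) tuple. It assumes plain Python lists in place of Tensor objects so that it runs without lucid installed. `PairDataset` is a hypothetical name, and the explicit range check is one possible way to reject negative indices, not necessarily what the library does.

```python
class PairDataset:
    """Hypothetical map-style dataset pairing each input with a target."""

    def __init__(self, inputs, targets):
        if len(inputs) != len(targets):
            raise ValueError("inputs and targets must have the same length")
        self.inputs = inputs
        self.targets = targets

    def __len__(self):
        return len(self.inputs)

    def __getitem__(self, index):
        # Negative indexing is not part of the protocol, so reject it
        # explicitly instead of letting the underlying list wrap around.
        if not 0 <= index < len(self):
            raise IndexError(f"index {index} out of range for {len(self)} samples")
        return self.inputs[index], self.targets[index]


pairs = PairDataset([[0.0, 1.0], [2.0, 3.0]], [0, 1])
sample_input, sample_target = pairs[1]
```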
__len__
__len__() → int
Return the number of samples in the dataset.
Returns
int
    Total number of samples available via __getitem__.
__add__
__add__(other: Dataset) → ConcatDataset
Concatenate this dataset with another via the + operator.
Parameters
other : Dataset
    The dataset to concatenate after self.
Returns
ConcatDataset
    A ConcatDataset wrapping [self, other].
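To make the concatenation concrete, here is one common way a wrapper over [self, other] can map a global index to the right underlying dataset: keep cumulative sizes and binary-search them. This is a sketch under that assumption, not lucid's ConcatDataset implementation. `ConcatSketch` is a hypothetical name, and plain lists stand in for datasets because they also provide `__len__` and `__getitem__`.

```python
from bisect import bisect_right
from itertools import accumulate


class ConcatSketch:
    """Hypothetical stand-in showing global-to-local index mapping."""

    def __init__(self, datasets):
        self.datasets = list(datasets)
        # Running totals, e.g. sizes [3, 2] give cumulative sizes [3, 5].
        self.cumulative_sizes = list(accumulate(len(d) for d in self.datasets))

    def __len__(self):
        return self.cumulative_sizes[-1] if self.cumulative_sizes else 0

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(f"index {index} out of range for {len(self)} samples")
        # The first cumulative size strictly greater than index marks
        # the dataset that owns this sample.
        ds_idx = bisect_right(self.cumulative_sizes, index)
        offset = self.cumulative_sizes[ds_idx - 1] if ds_idx > 0 else 0
        return self.datasets[ds_idx][index - offset]


combined = ConcatSketch([[10, 20, 30], [40, 50]])
items = [combined[i] for i in range(len(combined))]
```

With this layout, indices 0 to 2 resolve to the first dataset and indices 3 to 4 to the second, so the combined object still satisfies the map-style protocol.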