class

ConcatDataset

extends Dataset
ConcatDataset(datasets: list[Dataset])

Dataset formed by concatenating several map-style datasets end-to-end.

Sample i is fetched from the first child whose cumulative length exceeds i. The index is then made local to that child by subtracting the combined length of all preceding children.

Parameters

datasets : list of Dataset
Child datasets, concatenated in order. Each child must implement __len__ and __getitem__.

Notes

A list of cumulative length offsets is built once at construction time, so index resolution reduces to a single scan bounded by the number of children. For large child lists the scan could be replaced by a binary search over the offsets, giving O(log n) lookup, where n is the number of children. Children are stored by reference; no per-sample copy is made.
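The offset bookkeeping described in this note can be sketched as follows. This is only an illustration under stated assumptions, not the library's implementation: `build_offsets` and `resolve_index` are hypothetical helper names, the sketch uses the binary-search variant (via the standard-library `bisect` module) rather than the linear scan, and raising `IndexError` for out-of-range indices is assumed behaviour.

```python
from bisect import bisect_right
from itertools import accumulate


def build_offsets(lengths):
    """Return cumulative lengths, e.g. [3, 2, 4] -> [3, 5, 9]."""
    return list(accumulate(lengths))


def resolve_index(offsets, idx):
    """Map a non-negative global index to (child_position, local_index).

    The selected child is the first one whose cumulative length exceeds idx.
    """
    total = offsets[-1] if offsets else 0
    if not 0 <= idx < total:
        # Assumption: out-of-range indices raise IndexError.
        raise IndexError("index out of range")
    # bisect_right returns the position of the first offset strictly > idx.
    child = bisect_right(offsets, idx)
    # Start of this child in global coordinates: the previous cumulative length.
    start = offsets[child - 1] if child > 0 else 0
    return child, idx - start
```

With child lengths 3, 2 and 4, global index 3 resolves to the first element of the second child, and empty children are skipped naturally because their cumulative length never exceeds the index.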

Examples

>>> combined = ConcatDataset([ds_a, ds_b, ds_c])
>>> len(combined) == len(ds_a) + len(ds_b) + len(ds_c)
True

Methods (3)

dunder

__init__

None
__init__(datasets: list[Dataset])

Initialise the instance. See the class docstring for parameter semantics.

dunder

__len__

int
__len__()

Return the total length, that is, the sum of all child dataset lengths.

dunder

__getitem__

Tensor or tuple of Tensor
__getitem__(idx: int)

Return the sample at the given global index.

Parameters

idx : int
Index into the concatenated dataset. A negative index is translated relative to the total length, so idx becomes len(self) + idx and -1 refers to the last sample.

Returns

Tensor or tuple of Tensor

Sample fetched from the appropriate child dataset.
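To make the lookup concrete, here is a minimal sketch of how the three methods could fit together, including the negative-index translation. `SketchConcat` is a hypothetical stand-in, not the library class; plain Python lists play the role of child datasets because they already implement `__len__` and `__getitem__`, and the `IndexError` on out-of-range input is an assumption.

```python
class SketchConcat:
    """Illustrative stand-in for ConcatDataset; not the library class."""

    def __init__(self, datasets):
        self.datasets = list(datasets)  # children are kept by reference
        self.offsets = []               # cumulative lengths, built once
        total = 0
        for ds in self.datasets:
            total += len(ds)
            self.offsets.append(total)

    def __len__(self):
        return self.offsets[-1] if self.offsets else 0

    def __getitem__(self, idx):
        n = len(self)
        if idx < 0:
            idx += n  # translate a negative index relative to the total length
        if not 0 <= idx < n:
            # Assumption: out-of-range indices raise IndexError.
            raise IndexError("index out of range")
        # Linear scan: pick the first child whose cumulative length exceeds idx.
        start = 0
        for ds, end in zip(self.datasets, self.offsets):
            if idx < end:
                return ds[idx - start]
            start = end
        raise AssertionError("unreachable: idx was range-checked above")


combined = SketchConcat([[10, 11, 12], [20, 21], [30]])
first_of_second = combined[3]   # first sample of the second child
last_sample = combined[-1]      # last sample overall
```

The scan visits each child at most once, which matches the bounded-scan behaviour described in the class notes.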