fn

random_split

list of Subset
random_split(dataset: Dataset, lengths: list[int] | list[float], generator: object = None)

Randomly split a dataset into non-overlapping Subset views.

Shuffles range(len(dataset)) and slices it into chunks of the requested lengths, wrapping each slice in a Subset. The subsets do not copy the underlying samples; each one holds the parent dataset by reference.
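The reference-holding behaviour described above can be illustrated with a small stand-in class. This is only a sketch: SubsetView is a hypothetical name, not the library's Subset, and a plain Python list plays the role of the parent dataset.

```python
from typing import Sequence


class SubsetView:
    """Hypothetical stand-in for Subset: indices plus a reference to the parent."""

    def __init__(self, dataset: Sequence, indices: list[int]) -> None:
        self.dataset = dataset    # stored by reference, nothing is copied
        self.indices = indices

    def __len__(self) -> int:
        return len(self.indices)

    def __getitem__(self, i: int):
        # Map the local position to a parent index on every access.
        return self.dataset[self.indices[i]]


parent = ["a", "b", "c", "d"]
view = SubsetView(parent, [2, 0])
parent[2] = "C"    # change a sample through the parent
print(view[0])     # the view reads the updated parent sample
```

Because the view looks samples up in the parent at access time, changes to the parent made after the split are visible through every subset.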

Parameters

dataset : Dataset
Source dataset to split.
lengths : list of int or list of float
Either absolute split sizes summing to len(dataset), or fractions in [0, 1] summing (approximately) to 1.0. In the fractional case, rounding error is absorbed by the final split so the totals stay consistent.
generator : optional, default None
Seed-like object forwarded to random.Random for reproducibility. If None, the global random state is used.
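One way the lengths argument might be resolved into absolute sizes is sketched below. The helper name resolve_lengths is hypothetical, and the rounding rule (floor every split except the last, then give the remainder to the last) is an assumption; the text above only states that the final split absorbs the rounding error.

```python
import math


def resolve_lengths(total: int, lengths: list) -> list[int]:
    """Hypothetical helper: turn lengths into absolute split sizes."""
    if all(isinstance(x, int) for x in lengths):
        # Absolute sizes must cover the dataset exactly.
        if sum(lengths) != total:
            raise ValueError("integer lengths must sum to len(dataset)")
        return list(lengths)
    # Fractions must sum to 1.0 within the documented 1e-6 tolerance.
    if abs(sum(lengths) - 1.0) > 1e-6:
        raise ValueError("fractional lengths must sum to 1.0")
    # Assumed rounding rule: floor all but the last split.
    sizes = [math.floor(total * frac) for frac in lengths[:-1]]
    sizes.append(total - sum(sizes))   # final split absorbs the remainder
    return sizes


print(resolve_lengths(10, [0.5, 0.25, 0.25]))
```

With 10 samples, the middle fraction 0.25 corresponds to 2.5 samples, so under this assumed rule it is floored to 2 and the final split receives 3.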

Returns

list of Subset

One Subset per requested split, in the same order as lengths.

Raises

ValueError
If fractional lengths do not sum to 1.0 (within 1e-6) or integer lengths do not sum to len(dataset).

Notes

The split is permutation-based: range(len(dataset)) is shuffled once and then sliced into the requested chunks. Reproducibility is obtained by seeding the global RNG via lucid.manual_seed, or by passing an explicit generator seed; the same generator state always yields the same partition.
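The shuffle-once-then-slice procedure can be sketched with the standard library alone. The function split_indices and its seed parameter are hypothetical names used for illustration, and the sketch works on index lists rather than Subset objects.

```python
import random


def split_indices(n: int, sizes: list[int], seed=None) -> list[list[int]]:
    """Hypothetical sketch: shuffle range(n) once, then slice it into chunks."""
    # Note: random.Random(None) seeds itself from the OS or clock; it does
    # not read the global RNG state, unlike the behaviour documented above.
    rng = random.Random(seed)
    perm = list(range(n))
    rng.shuffle(perm)

    chunks, start = [], 0
    for size in sizes:
        chunks.append(perm[start:start + size])   # consecutive, disjoint slices
        start += size
    return chunks


first = split_indices(10, [6, 2, 2], seed=0)
second = split_indices(10, [6, 2, 2], seed=0)
print(first == second)   # same seed, same partition
```

Because every chunk is a slice of a single permutation, the chunks cannot overlap, and together they cover each index exactly once whenever the sizes sum to n.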

Examples

>>> # Assumes X and y each hold 100 samples.
>>> full = TensorDataset(X, y)
>>> train, val, test = random_split(full, [0.8, 0.1, 0.1])
>>> len(train), len(val), len(test)
(80, 10, 10)