random_split
random_split(dataset: Dataset, lengths: list[int] | list[float], generator: object = None) → list of Subset
Randomly split a dataset into non-overlapping Subset views.
Shuffles range(len(dataset)) and slices it into chunks of the
requested lengths, wrapping each slice in a Subset. The
resulting subsets do not copy the underlying samples; each one holds
the parent dataset by reference.
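The shuffle-then-slice behavior described above can be illustrated with a minimal standalone sketch. It does not use the library's own classes: `SubsetView` and `split_by_permutation` are hypothetical names introduced here, and the `seed` argument is an assumption standing in for the `generator` parameter.

```python
import random


class SubsetView:
    """Hypothetical stand-in for Subset: a parent reference plus a list of indices."""

    def __init__(self, dataset, indices):
        self.dataset = dataset  # reference to the parent, not a copy
        self.indices = indices

    def __len__(self):
        return len(self.indices)

    def __getitem__(self, i):
        # Look the sample up in the parent on every access.
        return self.dataset[self.indices[i]]


def split_by_permutation(dataset, lengths, seed=None):
    """Shuffle range(len(dataset)) once, then slice it into integer-sized chunks."""
    if sum(lengths) != len(dataset):
        raise ValueError("lengths must sum to len(dataset)")
    indices = list(range(len(dataset)))
    random.Random(seed).shuffle(indices)
    views, start = [], 0
    for n in lengths:
        views.append(SubsetView(dataset, indices[start:start + n]))
        start += n
    return views


data = [x * x for x in range(10)]
a, b = split_by_permutation(data, [7, 3], seed=0)
```

Because every view indexes into the same parent object, the split costs one index list per subset rather than a copy of the samples.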
Parameters
dataset : Dataset
    Source dataset to split.
lengths : list of int or list of float
    Either absolute split sizes summing to len(dataset), or
    fractions in [0, 1] summing (approximately) to 1.0. In
    the fractional case, rounding error is absorbed by the final
    split so the totals stay consistent.
generator : optional, default None
    Seed-like object forwarded to random.Random for
    reproducibility. If None, the global random state is used.
Returns
list of Subset
    One Subset per entry in lengths, in the same order as lengths.
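How fractional lengths might be turned into integer sizes, with the final split absorbing the rounding error, can be sketched as follows. This is an illustration under stated assumptions, not the library's code: `resolve_lengths` is a hypothetical helper, flooring every fraction except the last is an assumed rounding rule, and treating the list as fractional whenever it contains a float is also an assumption. Only the 1e-6 tolerance and the two sum checks come from the documentation.

```python
import math


def resolve_lengths(lengths, total, tol=1e-6):
    """Turn integer or fractional lengths into integer sizes summing to total."""
    if any(isinstance(x, float) for x in lengths):
        if abs(sum(lengths) - 1.0) > tol:
            raise ValueError("fractional lengths must sum to 1.0")
        # Assumption: floor every fraction except the last one.
        sizes = [math.floor(f * total) for f in lengths[:-1]]
        # The final split takes whatever remains, absorbing the rounding error.
        sizes.append(total - sum(sizes))
        return sizes
    if sum(lengths) != total:
        raise ValueError("integer lengths must sum to len(dataset)")
    return list(lengths)


sizes = resolve_lengths([0.8, 0.1, 0.1], 103)
print(sizes)  # [82, 10, 11]
```

With 103 samples the first two splits floor to 82 and 10, and the last split receives the remaining 11, so the sizes still sum to the dataset length.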
Raises
ValueError
    If fractional lengths do not sum to 1.0 (within 1e-6)
    or integer lengths do not sum to len(dataset).
Notes
The split is permutation-based: range(len(dataset)) is shuffled
once and then sliced into the requested chunks. Reproducibility is
obtained by seeding the global RNG via lucid.manual_seed, or
by passing an explicit generator seed; the same generator state
always yields the same partition.
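The reproducibility claim above can be checked with the standard library alone. The sketch below does not call lucid.manual_seed or random_split; `shuffled_indices` is a hypothetical helper, and it assumes that `generator` is a seed handed to random.Random, with None falling back to the module-level random state.

```python
import random


def shuffled_indices(n, generator=None):
    """Return a shuffled copy of range(n).

    Assumption: generator is a seed for random.Random; None uses the
    module-level (global) random state instead.
    """
    indices = list(range(n))
    rng = random.Random(generator) if generator is not None else random
    rng.shuffle(indices)
    return indices


# Explicit seed: two calls with the same seed give the same permutation.
first = shuffled_indices(10, generator=42)
second = shuffled_indices(10, generator=42)

# Global state: reseeding before each call has the same effect.
random.seed(7)
third = shuffled_indices(10)
random.seed(7)
fourth = shuffled_indices(10)
```

Passing an explicit seed keeps the split independent of whatever else has consumed the global random state, which is usually the safer choice in longer scripts.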
Examples
>>> # X and y are assumed to hold 100 samples each
>>> full = TensorDataset(X, y)
>>> train, val, test = random_split(full, [0.8, 0.1, 0.1])
>>> len(train), len(val), len(test)
(80, 10, 10)