WorkerInfo
WorkerInfo(id: int, num_workers: int, seed: int, dataset: Dataset)
Per-worker context published inside a DataLoader worker process.
Returned by get_worker_info when called from within a
worker's __getitem__ / __iter__ / worker_init_fn. Lets
user code shard work, seed RNGs differently per worker, or open
per-worker file handles.
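As a rough illustration of the sharding use case, the sketch below splits an index range across workers using only the id and num_workers fields. WorkerInfoStub and shard_indices are hypothetical names introduced here so the snippet runs without the library; in real code the info object would come from get_worker_info.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class WorkerInfoStub:
    """Hypothetical stand-in that mirrors the documented WorkerInfo fields."""
    id: int
    num_workers: int
    seed: int
    dataset: Any


def shard_indices(info: WorkerInfoStub, total: int) -> range:
    """Return the contiguous block of indices owned by one worker."""
    per_worker = -(-total // info.num_workers)  # ceiling division
    start = info.id * per_worker
    stop = min(start + per_worker, total)
    return range(start, stop)  # empty when start >= stop


# Simulate three workers splitting ten items.
infos = [WorkerInfoStub(id=i, num_workers=3, seed=100 + i, dataset=None) for i in range(3)]
shards = [list(shard_indices(info, total=10)) for info in infos]
```

Contiguous blocks are only one possible split; a strided split (every num_workers-th item starting at id) balances load equally well and may suit streaming sources better.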
Parameters
id (int): Worker index in [0, num_workers).
num_workers (int): Number of workers in the DataLoader.
seed (int): Per-worker seed (base_seed + id). The
loader seeds Python random and (when available) numpy
with this value before invoking worker_init_fn.
dataset (Dataset): When spawn is
used, this is a deep-copied instance; mutations in one worker
are not visible to others.
Notes
Use get_worker_info rather than constructing this dataclass
directly; the per-thread storage is what actually wires it up.
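The seed field documented under Parameters is described as base_seed + id. The sketch below shows what that derivation implies for reproducibility, using only the standard library. worker_seed and draw_samples are hypothetical helpers, not part of the documented API, and a private random.Random instance stands in for the global seeding the loader performs.

```python
import random
from typing import List


def worker_seed(base_seed: int, worker_id: int) -> int:
    """Derive a per-worker seed as base_seed + id (assumed from the docs)."""
    return base_seed + worker_id


def draw_samples(base_seed: int, worker_id: int, n: int = 3) -> List[int]:
    """Draw n integers from an RNG seeded with the worker's seed."""
    rng = random.Random(worker_seed(base_seed, worker_id))
    return [rng.randint(0, 999) for _ in range(n)]


# Each worker gets a distinct seed, and the same (base_seed, id) pair
# always reproduces the same stream.
seeds = [worker_seed(1234, worker_id) for worker_id in range(4)]
first_run = draw_samples(1234, worker_id=0)
second_run = draw_samples(1234, worker_id=0)
```

Libraries that keep their own RNG state are not covered by this seeding; a worker_init_fn can read info.seed and seed them explicitly.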
Examples
>>> def worker_init_fn(worker_id):
...     info = get_worker_info()
...     # Shard an IterableDataset across workers:
...     info.dataset.start = info.id * shard_size
Methods (1)
__init__
__init__(id: int, num_workers: int, seed: int, dataset: Dataset) → None
Configure a DataLoader; see the class docstring for parameter
semantics.
Raises
ValueError
Notes
The sampler, batch_sampler, shuffle, batch_size, and
drop_last arguments interact: passing batch_sampler precludes
the other four; passing sampler precludes shuffle. When no
sampler is supplied, a SequentialSampler (shuffle=False)
or RandomSampler (shuffle=True) is constructed
automatically. persistent_workers requires num_workers > 0.
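
The argument interactions above can be read as a small set of validation rules. The following is a minimal sketch of those rules, not the loader's actual implementation: resolve_sampler is a hypothetical function, and treating None as "argument not passed" is an assumption made for illustration.

```python
from typing import Any, Optional


def resolve_sampler(
    sampler: Optional[Any] = None,
    batch_sampler: Optional[Any] = None,
    shuffle: Optional[bool] = None,
    batch_size: Optional[int] = None,
    drop_last: Optional[bool] = None,
    num_workers: int = 0,
    persistent_workers: bool = False,
) -> str:
    """Return a label for the sampler that would be used, or raise ValueError.

    Assumption: None means the caller did not pass the argument.
    """
    if persistent_workers and num_workers <= 0:
        raise ValueError("persistent_workers requires num_workers > 0")

    if batch_sampler is not None:
        # batch_sampler precludes the other four arguments.
        others = {"sampler": sampler, "shuffle": shuffle,
                  "batch_size": batch_size, "drop_last": drop_last}
        passed = [name for name, value in others.items() if value is not None]
        if passed:
            raise ValueError("batch_sampler precludes: " + ", ".join(passed))
        return "batch_sampler"

    if sampler is not None:
        # An explicit sampler precludes shuffle.
        if shuffle is not None:
            raise ValueError("sampler precludes shuffle")
        return "sampler"

    # No sampler supplied: choose one from the shuffle flag.
    return "RandomSampler" if shuffle else "SequentialSampler"
```

For example, resolve_sampler(shuffle=True) yields "RandomSampler", while resolve_sampler(sampler=object(), shuffle=True) raises ValueError.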