class

Dropout

extends Module
Dropout(p: float = 0.5, inplace: bool = False)

Randomly zero individual tensor elements during training (inverted dropout).

During training, each scalar element of the input is independently set to zero with probability $p$. The remaining elements are rescaled by $\frac{1}{1-p}$ so that the expected value of every element is preserved. This is the inverted dropout convention, which means no rescaling is needed at inference time:

$$
y_i = \begin{cases} \dfrac{x_i}{1 - p} & \text{with probability } 1 - p \\[6pt] 0 & \text{with probability } p \end{cases} \quad \text{(training)}
$$

$$
y_i = x_i \quad \text{(eval)}
$$

In eval mode the layer is the identity and the p parameter has no effect.
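The training and eval rules above can be illustrated with a small pure-Python sketch. It is not the library's implementation: `inverted_dropout` and its `training` and `rng` arguments are hypothetical names chosen for illustration, and it operates on a flat list rather than a tensor. The `p == 1` branch is an assumption about how to avoid dividing by zero.

```python
import random

def inverted_dropout(xs, p=0.5, training=True, rng=None):
    """Toy inverted dropout over a flat list of floats (illustrative only)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if not training or p == 0.0:
        return list(xs)             # eval mode (or p = 0): identity
    if p == 1.0:
        return [0.0] * len(xs)      # everything dropped; avoids dividing by zero
    rng = rng or random.Random()
    scale = 1.0 / (1.0 - p)
    # Keep each element with probability 1 - p and rescale the survivors.
    return [x * scale if rng.random() >= p else 0.0 for x in xs]

rng = random.Random(0)              # fixed seed so the run is repeatable
xs = [1.0] * 100_000
ys = inverted_dropout(xs, p=0.3, rng=rng)

mean = sum(ys) / len(ys)                  # stays close to 1.0
zero_fraction = ys.count(0.0) / len(ys)   # stays close to p = 0.3
```

Because the survivors are scaled by $\frac{1}{1-p}$, the sample mean of the output stays near the input mean even though roughly 30 % of the entries are zero.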

Why dropout works. By randomly disabling units, dropout discourages co-adaptation: individual neurons cannot rely on the presence of specific peers, so they are forced to learn more robust features. Dropout is approximately equivalent to averaging the predictions of an ensemble of $2^n$ sub-networks, one per binary mask over the $n$ droppable units.
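The ensemble view can be checked exactly in the simplest case, a single linear unit with $p = 0.5$, where all $2^n$ masks are equally likely. The sketch below enumerates every mask for $n = 3$ and compares the average masked output with the unmasked output. The weights and inputs are made-up values, and exact equality holds only because the unit is linear; for networks with nonlinearities the correspondence is approximate.

```python
from fractions import Fraction
from itertools import product

# Made-up weights and inputs for one linear unit (exact rational arithmetic).
w = [Fraction(1, 2), Fraction(-2), Fraction(3)]
x = [Fraction(4), Fraction(1), Fraction(-1)]
p = Fraction(1, 2)   # with p = 1/2 every mask has the same probability

# Output of the full network (no dropout).
full = sum(wi * xi for wi, xi in zip(w, x))

# One sub-network per binary mask: 2**3 = 8 of them.
masks = list(product([0, 1], repeat=len(x)))
outputs = [
    sum(wi * xi * m / (1 - p) for wi, xi, m in zip(w, x, mask))
    for mask in masks
]

# Averaging the rescaled sub-network outputs recovers the full output.
ensemble_mean = sum(outputs) / len(outputs)
```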

Parameters

p: float = 0.5
Probability of zeroing each element. Must be in [0, 1]. p=0 disables dropout; p=1 zeros the entire tensor. Default: 0.5.
inplace: bool = False
If True, modify the input tensor in place. Use with care when the input participates in the autograd graph. Default: False.
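To make the effect of `inplace` concrete, here is a toy sketch on Python lists. `toy_dropout` is a hypothetical helper, not part of the library, and it assumes `p < 1`. It shows that the in-place path overwrites the caller's data, which is why the flag needs care when something else, such as a backward pass, still needs the original values.

```python
import random

def toy_dropout(buf, p, inplace=False, rng=None):
    """Toy dropout on a list; `inplace` mirrors the flag described above."""
    rng = rng or random.Random()
    out = buf if inplace else list(buf)   # alias the input, or work on a copy
    scale = 1.0 / (1.0 - p)               # assumes p < 1
    for i in range(len(out)):
        out[i] = out[i] * scale if rng.random() >= p else 0.0
    return out

# Out-of-place: the input list is left untouched.
x = [1.0, 2.0, 3.0, 4.0]
y = toy_dropout(x, p=0.5, rng=random.Random(0))

# In-place: the returned list *is* the input, whose old values are gone.
x2 = [1.0, 2.0, 3.0, 4.0]
y2 = toy_dropout(x2, p=0.5, inplace=True, rng=random.Random(0))
```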

Notes

  • Input: any shape (*).
  • Output: same shape (*).

Dropout should only be applied during training. Call model.eval() before inference to switch all dropout layers to pass-through mode; call model.train() to re-enable them.

For convolutional feature maps, where adjacent spatial positions are highly correlated, per-element dropout is less effective because neighbouring activations still carry much of the dropped information. Consider Dropout2d instead.

Related layers:

  • Dropout1d: channel-wise dropout for 3-D inputs.
  • Dropout2d: channel-wise dropout for 4-D (image) inputs.
  • Dropout3d: channel-wise dropout for 5-D (volumetric) inputs.
  • AlphaDropout: dropout variant that preserves SELU statistics.
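The difference between per-element and channel-wise masking can be sketched on a nested list laid out as `[channel][position]`. Both helper functions are hypothetical illustrations that assume `p < 1`. They are not the library's Dropout1d/2d/3d code, only a picture of what "channel-wise" means: one keep/drop decision per channel instead of one per element.

```python
import random

def elementwise_dropout(fmap, p, rng):
    """Independent keep/drop decision for every element."""
    scale = 1.0 / (1.0 - p)
    return [[v * scale if rng.random() >= p else 0.0 for v in ch] for ch in fmap]

def channelwise_dropout(fmap, p, rng):
    """One keep/drop decision per channel, shared by all its positions."""
    scale = 1.0 / (1.0 - p)
    out = []
    for ch in fmap:
        keep = rng.random() >= p
        out.append([v * scale if keep else 0.0 for v in ch])
    return out

rng = random.Random(0)
fmap = [[1.0] * 6 for _ in range(8)]    # 8 channels, 6 positions each

per_element = elementwise_dropout(fmap, p=0.5, rng=rng)
per_channel = channelwise_dropout(fmap, p=0.5, rng=rng)

# With channel-wise masking every channel is all zeros or all 2.0, never a mix.
channel_is_uniform = [len(set(ch)) == 1 for ch in per_channel]
```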

Examples

Basic usage in training mode:
>>> import lucid, lucid.nn as nn
>>> drop = nn.Dropout(p=0.3)
>>> drop.train()
>>> x = lucid.ones(4, 8)
>>> y = drop(x)
>>> # Approximately 30 % of elements are zero; rest scaled by 1/0.7
>>> y.shape
(4, 8)

Disabled in eval mode:
>>> drop.eval()
>>> y_eval = drop(lucid.ones(4, 8))
>>> # All elements equal 1.0 — no masking
>>> float(y_eval.sum()) == 32.0
True

Methods (3)

dunder

__init__

None
__init__(p: float = 0.5, inplace: bool = False)

Initialise the Dropout module. See the class docstring for parameter semantics.

fn

forward

Tensor
forward(x: Tensor)

Apply dropout to the input tensor.

Parameters

x: Tensor
Input tensor of arbitrary shape.

Returns

Tensor

Output tensor of the same shape as the input; in eval mode the input is returned unchanged.

fn

extra_repr

str
extra_repr()

Return a string representation of the layer's configuration.