class Dropout extends Module

Dropout(p: float = 0.5, inplace: bool = False)

Implementing kernel

C++ class DropoutBackward

Randomly zero individual tensor elements during training (inverted dropout).

During training, each scalar element of the input is independently set to zero with probability $p$. The remaining elements are rescaled by $\frac{1}{1-p}$ so that the expected value of every element is preserved. This is the inverted dropout convention, which means no rescaling is needed at inference time:

$$y_i = \begin{cases} \dfrac{x_i}{1 - p} & \text{with probability } 1 - p \\[6pt] 0 & \text{with probability } p \end{cases} \quad \text{(training)}$$

$$y_i = x_i \quad \text{(eval)}$$

In eval mode the layer is the identity and the p parameter has no effect.
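The two cases above can be sketched in plain Python. This is an illustrative reference only, not the library's kernel: the function name `inverted_dropout`, its `training` and `rng` arguments, and the explicit handling of `p == 1.0` are assumptions made for the example.

```python
import random


def inverted_dropout(values, p=0.5, training=True, rng=None):
    """Illustrative inverted dropout on a flat list of floats (hypothetical helper)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError(f"p must be in [0, 1], got {p}")
    if not training or p == 0.0:
        return list(values)           # eval mode (or p = 0): identity
    if p == 1.0:
        return [0.0] * len(values)    # everything dropped; avoids dividing by zero
    if rng is None:
        rng = random.Random()
    scale = 1.0 / (1.0 - p)
    # Keep each element with probability 1 - p and rescale the survivors.
    return [v * scale if rng.random() >= p else 0.0 for v in values]


rng = random.Random(0)                # fixed seed so the run is repeatable
x = [1.0] * 100_000
y = inverted_dropout(x, p=0.3, rng=rng)
mean_train = sum(y) / len(y)          # stays close to 1.0: the mean is preserved
y_eval = inverted_dropout(x, p=0.3, training=False)
```

Because the survivors are scaled up during training, `mean_train` stays near the input mean, and the eval-mode call returns the input unchanged.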

Why dropout works. By randomly disabling units, dropout prevents co-adaptation: individual neurons cannot rely on the presence of specific peers, so they are forced to learn more robust features. Dropout is approximately equivalent to averaging the predictions of an ensemble of $2^n$ weight-sharing sub-networks (one per binary mask), where $n$ is the number of elements that can be dropped.
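The ensemble view can be checked by hand on a toy case. The sketch below uses a hypothetical single linear unit with made-up weights and inputs, enumerates all $2^n$ masks for $n = 3$ at $p = 0.5$ (where every mask is equally likely), and compares the average masked output with the unmasked output. For a linear unit the two agree exactly; for networks with nonlinearities the equivalence is only approximate.

```python
from itertools import product

# Hypothetical toy "network": a single linear unit y = sum(w_i * x_i).
w = [0.5, -1.0, 2.0]
x = [1.0, 2.0, 3.0]
p = 0.5
n = len(x)


def masked_output(mask):
    # Inverted dropout: surviving inputs are scaled by 1 / (1 - p).
    return sum(wi * xi * mi / (1.0 - p) for wi, xi, mi in zip(w, x, mask))


masks = list(product([0, 1], repeat=n))    # all 2**n binary masks
# With p = 0.5 each mask has probability 1 / 2**n, so a plain average
# over the masks is the expected output of the dropout "ensemble".
ensemble_mean = sum(masked_output(m) for m in masks) / len(masks)
full_output = sum(wi * xi for wi, xi in zip(w, x))
```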

Parameters

p: float = 0.5

Probability of zeroing each element. Must be in [0, 1]. p=0 disables dropout; p=1 zeros the entire tensor. Default: 0.5.

inplace: bool = False

If True, modify the input tensor in place. Use with care when the input participates in the autograd graph. Default: False.

Notes

Input: any shape (*).
Output: same shape (*).

Dropout should only be applied during training. Call model.eval() before inference to switch all dropout layers to pass-through mode; call model.train() to re-enable them.
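The mode switch can be pictured as a boolean flag that a parent module pushes down to its children. The classes below (`ToyModule`, `ToyDropout`, and the `children` list) are hypothetical stand-ins written for this sketch; they are not the library's implementation and only mimic the `train()` / `eval()` behaviour described above.

```python
import random


class ToyModule:
    """Hypothetical stand-in for a Module base class (not the library's code)."""

    def __init__(self):
        self.training = True
        self.children = []

    def train(self, mode=True):
        # Propagate the mode flag to every child module.
        self.training = mode
        for child in self.children:
            child.train(mode)
        return self

    def eval(self):
        return self.train(False)


class ToyDropout(ToyModule):
    def __init__(self, p=0.5):
        super().__init__()
        self.p = p

    def __call__(self, values):
        if not self.training:
            return list(values)            # pass-through in eval mode
        keep = 1.0 - self.p
        return [v / keep if random.random() < keep else 0.0 for v in values]


model = ToyModule()
drop = ToyDropout(p=0.3)
model.children.append(drop)

model.eval()                               # switches every child to pass-through
eval_out = drop([1.0, 2.0, 3.0])           # returned unchanged
model.train()                              # re-enables masking in every child
```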

For convolutional feature maps where adjacent spatial positions are highly correlated, per-element dropout tends to be less effective, because neighbouring activations still carry much of the dropped information. Consider Dropout2d instead.
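To make the contrast concrete, the sketch below builds two keep/drop masks for a small feature map in plain Python. It assumes that Dropout2d follows the common convention of dropping entire channels at once, which this page does not spell out; the shapes and variable names are made up for illustration, and rescaling is omitted to keep the focus on the mask layout.

```python
import random

rng = random.Random(0)
p = 0.5
channels, height, width = 4, 2, 3

# Element-wise: one independent keep/drop decision per scalar.
elementwise_mask = [
    [[int(rng.random() >= p) for _ in range(width)] for _ in range(height)]
    for _ in range(channels)
]

# Channel-wise (assumed Dropout2d-style): one decision per channel,
# shared by every spatial position in that channel.
channel_keep = [int(rng.random() >= p) for _ in range(channels)]
channelwise_mask = [
    [[keep] * width for _ in range(height)] for keep in channel_keep
]
```

In the channel-wise mask every position of a channel shares the same fate, so correlated neighbours cannot fill in for a dropped activation.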

Examples

Basic usage in a linear classifier head:
>>> import lucid, lucid.nn as nn
>>> drop = nn.Dropout(p=0.3)
>>> drop.train()
>>> x = lucid.ones(4, 8)
>>> y = drop(x)
>>> # Approximately 30 % of elements are zero; rest scaled by 1/0.7
>>> y.shape
(4, 8)

Disabled in eval mode:
>>> drop.eval()
>>> y_eval = drop(lucid.ones(4, 8))
>>> # All elements equal 1.0 — no masking
>>> float(y_eval.sum()) == 32.0
True

Constructors

__init__ → None

__init__(p: float = 0.5, inplace: bool = False)

Initialise the Dropout module. See the class docstring for parameter semantics.

Instance methods

extra_repr → str

extra_repr()

Return a string representation of the layer's configuration.

forward → Tensor

forward(x: Tensor)

Apply dropout to the input tensor.

Parameters

x: Tensor

Input tensor of arbitrary shape.

Returns

Tensor

Output tensor with the same shape as x. In eval mode the output equals the input.
