class

AlphaDropout

extends Module
AlphaDropout(p: float = 0.5, inplace: bool = False)

Alpha dropout — element-wise dropout that preserves SELU self-normalisation.

Standard dropout breaks the zero-mean / unit-variance property of SELU (https://arxiv.org/abs/1706.02515) activations because it sets units to zero, which is not the saturation value of SELU and changes the variance of the activations. Alpha dropout fixes this by replacing dropped units with the fixed negative saturation value $\alpha'$ and then applying an affine correction:

$$
y = \begin{cases} x & \text{with probability } 1 - p \\ \alpha' & \text{with probability } p \end{cases}
$$

$$
\tilde{y} = a \cdot y + b
$$

where $\alpha' = -\lambda\alpha$ with $\lambda \approx 1.0507$ and $\alpha \approx 1.6733$ (the SELU fixed-point constants), and the affine coefficients $a, b$ are chosen so that $\mathbb{E}[\tilde{y}] = 0$ and $\operatorname{Var}[\tilde{y}] = 1$ after the mask is applied, given inputs with zero mean and unit variance. As a result, the output of each alpha-dropout layer keeps approximately zero mean and unit variance, preserving the self-normalising property that makes deep SELU networks trainable without batch normalisation.
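The coefficients $a$ and $b$ are not spelled out above. The sketch below assumes zero-mean, unit-variance inputs and uses the closed form given in the SELU paper, $a = (q + \alpha'^2 p q)^{-1/2}$ and $b = -a\,p\,\alpha'$ with $q = 1 - p$. The helper names `affine_coefficients` and `alpha_dropout_sketch` are hypothetical, and the code is a pure-Python illustration rather than lucid's implementation, which may differ in detail.

```python
import math
import random

# SELU fixed-point constants (Klambauer et al., 2017).
SELU_LAMBDA = 1.0507009873554805
SELU_ALPHA = 1.6732632423543772
ALPHA_PRIME = -SELU_LAMBDA * SELU_ALPHA  # saturation value, about -1.7581


def affine_coefficients(p):
    """Return (a, b) restoring zero mean / unit variance for standardised inputs."""
    if not 0.0 <= p < 1.0:
        # p = 1 drops everything, so the variance cannot be restored.
        raise ValueError("this sketch requires 0 <= p < 1")
    q = 1.0 - p
    # Var[y] = q + alpha'^2 * p * q and E[y] = p * alpha' for the masked value y.
    a = 1.0 / math.sqrt(q + ALPHA_PRIME ** 2 * p * q)
    b = -a * p * ALPHA_PRIME
    return a, b


def alpha_dropout_sketch(values, p, rng):
    """Element-wise alpha dropout over a flat list of floats (training mode)."""
    a, b = affine_coefficients(p)
    out = []
    for x in values:
        y = ALPHA_PRIME if rng.random() < p else x  # drop with probability p
        out.append(a * y + b)                        # affine correction
    return out


rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(200_000)]
ys = alpha_dropout_sketch(xs, p=0.2, rng=rng)
mean_y = sum(ys) / len(ys)
var_y = sum((y - mean_y) ** 2 for y in ys) / len(ys)
print(f"mean={mean_y:.3f} var={var_y:.3f}")
```

With standardised inputs, the printed mean and variance should stay close to 0 and 1 for any `p` below 1.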

Parameters

p : float = 0.5
Probability of replacing an element with $\alpha'$. Must be in [0, 1]. Default: 0.5.
inplace : bool = False
Currently accepted for API compatibility but has no effect (the affine correction always produces a new tensor). Default: False.

Notes

  • Input: any shape (*).
  • Output: same shape (*).

Alpha dropout should be used exclusively with lucid.nn.SELU activations. Using it after other activations (ReLU, tanh, etc.) will not preserve any statistical invariant and is likely harmful.
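To illustrate why the pairing with SELU matters, the sketch below feeds standard-normal pre-activations through SELU and through ReLU, then applies the same reference alpha dropout to both. The function `alpha_dropout_stats` is a hypothetical pure-Python helper that uses the affine coefficients from the SELU paper. It is an assumption for illustration, not lucid's code.

```python
import math
import random

# SELU fixed-point constants; ALPHA_PRIME is the saturation value -lambda * alpha.
SELU_LAMBDA = 1.0507009873554805
SELU_ALPHA = 1.6732632423543772
ALPHA_PRIME = -SELU_LAMBDA * SELU_ALPHA


def selu(x):
    """Scalar SELU activation."""
    return SELU_LAMBDA * x if x > 0 else SELU_LAMBDA * SELU_ALPHA * (math.exp(x) - 1.0)


def alpha_dropout_stats(activations, p, rng):
    """Apply a reference alpha dropout; return (mean, variance) of the output."""
    q = 1.0 - p
    a = 1.0 / math.sqrt(q + ALPHA_PRIME ** 2 * p * q)
    b = -a * p * ALPHA_PRIME
    out = [a * (ALPHA_PRIME if rng.random() < p else x) + b for x in activations]
    mean = sum(out) / len(out)
    var = sum((v - mean) ** 2 for v in out) / len(out)
    return mean, var


rng = random.Random(0)
pre = [rng.gauss(0.0, 1.0) for _ in range(200_000)]

# SELU activations are standardised, so the correction restores mean 0 / variance 1.
selu_stats = alpha_dropout_stats([selu(x) for x in pre], p=0.2, rng=rng)
# ReLU activations are not, so the same correction leaves both moments off target.
relu_stats = alpha_dropout_stats([max(x, 0.0) for x in pre], p=0.2, rng=rng)

print("SELU + alpha dropout:", selu_stats)
print("ReLU + alpha dropout:", relu_stats)
```

With `p = 0.2` the SELU branch should land near mean 0 and variance 1, while the ReLU branch ends up at roughly mean 0.28 and variance 0.79, so the invariant is lost.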

In eval mode the layer is the identity (no masking, no affine correction).

See also

Dropout : Standard element-wise dropout.
FeatureAlphaDropout : Channel-wise variant of alpha dropout.

Examples

In a self-normalising MLP (SELU + AlphaDropout):
>>> import lucid, lucid.nn as nn
>>> mlp = nn.Sequential(
...     nn.Linear(32, 64),
...     nn.SELU(),
...     nn.AlphaDropout(p=0.05),
...     nn.Linear(64, 10),
... )
>>> mlp.train()
>>> y = mlp(lucid.randn(8, 32))
>>> y.shape
(8, 10)
Verify that eval mode is a no-op:
>>> drop = nn.AlphaDropout(p=0.5)
>>> drop.eval()
>>> x = lucid.randn(4, 16)
>>> # In eval mode the output equals the input; only the shape is checked here
>>> out = drop(x)
>>> out.shape
(4, 16)

Methods (3)

dunder

__init__

None
__init__(p: float = 0.5, inplace: bool = False)

Initialise the AlphaDropout module. See the class docstring for parameter semantics.

fn

forward

Tensor
forward(x: Tensor)

Apply alpha dropout to the input tensor.

Parameters

x : Tensor
Input tensor of arbitrary shape.

Returns

Tensor

Output tensor of the same shape as x; in eval mode this is the identity.

fn

extra_repr

str
extra_repr()

Return a string representation of the layer's configuration.