AlphaDropout
Module
AlphaDropout(p: float = 0.5, inplace: bool = False)
Alpha dropout: element-wise dropout that preserves SELU self-normalisation.
Standard dropout breaks the zero-mean / unit-variance property of
SELU (https://arxiv.org/abs/1706.02515) activations because setting
units to zero (and rescaling the survivors) changes the variance.
Alpha dropout fixes this by replacing dropped units with SELU's fixed
negative saturation value and then applying an affine correction:
\tilde{y} = a \cdot y + b
where dropped units take the value \alpha' = -\lambda\alpha, with \lambda \approx 1.0507 and \alpha \approx 1.6733 (the SELU fixed-point constants), and the affine coefficients a and b are chosen so that E[\tilde{y}] = 0 and Var[\tilde{y}] = 1 after the mask is applied. The result is that the output of each alpha-dropout layer keeps approximately zero mean and unit variance, preserving the self-normalising property that makes deep SELU networks trainable without batch normalisation.
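The affine correction can be checked numerically. The sketch below is a plain-Python illustration, not lucid's implementation: the helper names (`affine_coeffs`, `alpha_dropout`, `standard_dropout`, `mean_var`) are hypothetical, and the closed forms for `a` and `b` are the ones given in the SELU paper for keep probability q = 1 - p. It assumes zero-mean, unit-variance inputs and compares alpha dropout with standard dropout that rescales survivors by 1 / q.

```python
import math
import random

# SELU fixed-point constants (Klambauer et al., 2017).
LAMBDA = 1.0507009873554805
ALPHA = 1.6732632423543772
ALPHA_PRIME = -LAMBDA * ALPHA  # negative saturation value, roughly -1.7581


def affine_coeffs(p):
    """Return (a, b) for drop probability p (hypothetical helper, requires p < 1)."""
    q = 1.0 - p  # keep probability
    # Masked activations have mean (1 - q) * alpha' and variance
    # q + alpha'^2 * q * (1 - q); a and b undo both.
    a = 1.0 / math.sqrt(q + ALPHA_PRIME ** 2 * q * (1.0 - q))
    b = -a * (1.0 - q) * ALPHA_PRIME
    return a, b


def alpha_dropout(xs, p, rng):
    """Dropped units become alpha', then the affine correction is applied."""
    a, b = affine_coeffs(p)
    return [a * (x if rng.random() >= p else ALPHA_PRIME) + b for x in xs]


def standard_dropout(xs, p, rng):
    """Dropped units become 0; survivors are rescaled by 1 / (1 - p)."""
    q = 1.0 - p
    return [x / q if rng.random() >= p else 0.0 for x in xs]


def mean_var(xs):
    """Sample mean and (biased) sample variance of a list of floats."""
    m = sum(xs) / len(xs)
    return m, sum((x - m) ** 2 for x in xs) / len(xs)


rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(200_000)]

alpha_mean, alpha_var = mean_var(alpha_dropout(xs, 0.2, rng))
std_mean, std_var = mean_var(standard_dropout(xs, 0.2, rng))

print(f"alpha dropout:    mean={alpha_mean:.3f} var={alpha_var:.3f}")
print(f"standard dropout: mean={std_mean:.3f} var={std_var:.3f}")
```

Under these assumptions the alpha-dropout statistics should stay close to mean 0 and variance 1, while standard dropout with p = 0.2 should show a variance near 1 / (1 - p) = 1.25.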
Parameters
p : float = 0.5
    Probability of dropping an element, in [0, 1]. Default: 0.5.
inplace : bool = False
    In-place flag. Default: False.
Notes
- Input: any shape (*).
- Output: same shape (*).
Alpha dropout should be used exclusively with lucid.nn.SELU
activations. Using it after other activations (ReLU, tanh, etc.)
will not preserve any statistical invariant and is likely harmful.
In eval mode the layer is the identity (no masking, no affine correction).
See also
Dropout : Standard element-wise dropout.
FeatureAlphaDropout : Channel-wise variant of alpha dropout.
Examples
In a self-normalising MLP (SELU + AlphaDropout):
>>> import lucid, lucid.nn as nn
>>> mlp = nn.Sequential(
...     nn.Linear(32, 64),
...     nn.SELU(),
...     nn.AlphaDropout(p=0.05),
...     nn.Linear(64, 10),
... )
>>> mlp.train()
>>> y = mlp(lucid.randn(8, 32))
>>> y.shape
(8, 10)
Verify that eval mode is a no-op:
>>> drop = nn.AlphaDropout(p=0.5)
>>> drop.eval()
>>> x = lucid.randn(4, 16)
>>> # Output should equal input exactly in eval mode
>>> out = drop(x)
>>> out.shape
(4, 16)
Methods (3)
__init__
__init__(p: float = 0.5, inplace: bool = False) → None
Initialise the AlphaDropout module. See the class docstring for parameter semantics.
forward
forward(x: Tensor) → Tensor
Apply alpha dropout to the input tensor.
Parameters
x : Tensor
    Input tensor of any shape.
Returns
Tensor
    Output tensor of the same shape as x; in eval mode this is
    the identity.
extra_repr
extra_repr() → str
Return a string representation of the layer's configuration.
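How the three methods fit together can be sketched with a small stand-in class. Everything here is an assumption made for illustration: `TinyAlphaDropout` is hypothetical, it works on plain Python lists rather than lucid tensors, it restricts p to [0, 1) to avoid dividing by zero, and the `extra_repr` format is only a guess at a conventional layout, not lucid's actual output.

```python
import random

# alpha' = -lambda * alpha, built from the SELU fixed-point constants.
ALPHA_PRIME = -1.0507009873554805 * 1.6732632423543772


class TinyAlphaDropout:
    """Hypothetical stand-in for illustration; not lucid's implementation."""

    def __init__(self, p=0.5, inplace=False):
        # This sketch excludes p == 1, where the affine scale is undefined.
        if not 0.0 <= p < 1.0:
            raise ValueError(f"p must be in [0, 1) for this sketch, got {p}")
        self.p = p
        self.inplace = inplace  # stored only; the sketch never mutates its input
        self.training = True

    def train(self):
        self.training = True
        return self

    def eval(self):
        self.training = False
        return self

    def forward(self, x, rng=random):
        if not self.training or self.p == 0.0:
            return x  # identity: no masking, no affine correction
        q = 1.0 - self.p  # keep probability
        a = (q + ALPHA_PRIME ** 2 * q * (1.0 - q)) ** -0.5
        b = -a * (1.0 - q) * ALPHA_PRIME
        # Kept units pass through; dropped units become alpha'.
        return [a * (v if rng.random() < q else ALPHA_PRIME) + b for v in x]

    def extra_repr(self):
        # Assumed format; the real layer may print something different.
        return f"p={self.p}, inplace={self.inplace}"


drop = TinyAlphaDropout(p=0.5)
x = [0.3, -1.2, 0.8, 2.0]

train_out = drop.forward(x, rng=random.Random(0))  # masked and rescaled
eval_out = drop.eval().forward(x)                  # returned unchanged

print(drop.extra_repr())
print(eval_out == x)
```

In training mode the output has the same length as the input but different values; after `eval()` the input comes back untouched, matching the identity behaviour described for `forward`.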