class SELU

extends Module

SELU(inplace: bool = False)

Scaled Exponential Linear Unit activation function.

Applies element-wise:

$$\text{SELU}(x) = \lambda \begin{cases} x & \text{if } x > 0 \\ \alpha \left(e^{x} - 1\right) & \text{otherwise} \end{cases}$$

where $\lambda \approx 1.0507$ and $\alpha \approx 1.6733$ are fixed constants chosen so that zero mean and unit variance form a stable fixed point of the layer-to-layer mapping (the self-normalising property). When weights are initialised with lecun_normal and the network uses only SELU activations, the mean and variance of each layer's output converge toward 0 and 1, respectively, which enables stable training of very deep fully-connected networks without batch normalisation.
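As a cross-check of the formula, the following is a minimal scalar sketch in plain Python. It does not use lucid and is not the library's implementation; the helper name `selu` and the module-level constants are illustrative, and the constants are the commonly quoted double-precision values of $\lambda$ and $\alpha$.

```python
import math

# Commonly quoted double-precision values of the SELU constants
# (lambda ~ 1.0507, alpha ~ 1.6733). Assumed here, not read from lucid.
LAMBDA = 1.0507009873554805
ALPHA = 1.6732632423543772


def selu(x: float) -> float:
    """Scalar SELU: lambda * x for x > 0, else lambda * alpha * (e^x - 1)."""
    if x > 0:
        return LAMBDA * x
    # expm1 computes e^x - 1 accurately for x close to zero
    return LAMBDA * ALPHA * math.expm1(x)


values = [selu(v) for v in [-2.0, -1.0, 0.0, 1.0, 2.0]]
print([round(v, 4) for v in values])
# prints [-1.5202, -1.1113, 0.0, 1.0507, 2.1014]
```

The negative branch saturates at $-\lambda\alpha \approx -1.7581$ as $x \to -\infty$, while the positive branch is linear with slope $\lambda$.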

Parameters

inplace: bool = False

If True, modifies the input tensor in-place. Default: False.

Notes

Input: $(*)$ — any shape.
Output: $(*)$ — same shape as input.

The self-normalising guarantee holds strictly only for networks with no skip connections, no convolutional layers, and LeCun-normal weight initialisation. In practice SELU is also used more broadly as a drop-in replacement for ELU, in which case the guarantee no longer applies.
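The self-normalising behaviour can be observed empirically. The sketch below uses NumPy rather than lucid, and every name in it (`selu`, `layer_stats`, and its arguments) is hypothetical. It assumes LeCun-normal means zero-mean Gaussian weights with variance 1 / fan_in, omits biases, and feeds deliberately mis-scaled input (standard deviation 2) through a stack of square fully-connected layers, recording the pooled mean and variance after each activation.

```python
import numpy as np

# Commonly quoted SELU constants (assumed, not read from lucid).
LAMBDA = 1.0507009873554805
ALPHA = 1.6732632423543772


def selu(x):
    # Clamp the exponential branch's argument so large positive inputs
    # cannot overflow expm1 before np.where discards that branch.
    return LAMBDA * np.where(x > 0, x, ALPHA * np.expm1(np.minimum(x, 0.0)))


def layer_stats(depth=32, width=512, batch=256, input_scale=2.0, seed=0):
    """Return (mean, variance) of the activations after each layer."""
    rng = np.random.default_rng(seed)
    h = input_scale * rng.standard_normal((batch, width))
    stats = []
    for _ in range(depth):
        # LeCun-normal initialisation: N(0, 1 / fan_in), no bias term.
        w = rng.standard_normal((width, width)) / np.sqrt(width)
        h = selu(h @ w)
        stats.append((float(h.mean()), float(h.var())))
    return stats


stats = layer_stats()
for i in (0, 3, 15, 31):
    mean, var = stats[i]
    print(f"layer {i + 1:2d}: mean={mean:+.3f}  var={var:.3f}")
```

With these settings the first layer's variance is well above 1, and the statistics should drift toward mean 0 and variance 1 with depth. The exact numbers depend on the seed and on the finite layer width, so they are not reproduced here.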

Examples

>>> import lucid
>>> import lucid.nn as nn
>>> m = nn.SELU()
>>> x = lucid.tensor([-2.0, -1.0, 0.0, 1.0, 2.0])
>>> m(x)
tensor([-1.5202, -1.1113,  0.    ,  1.0507,  2.1014])
>>> # Suitable for deep fully-connected architectures
>>> layers = nn.Sequential(nn.Linear(128, 64), nn.SELU(), nn.Linear(64, 10))
>>> x = lucid.randn(8, 128)
>>> out = layers(x)
>>> out.shape
(8, 10)

Used by 1

lucid.nn.modules

Constructors

__init__ → None

__init__(inplace: bool = False)

Initialise the SELU module. See the class docstring for parameter semantics.

Instance methods

forward → Tensor

forward(x: Tensor)

Apply the activation function element-wise.

Parameters

x: Tensor

Input tensor of arbitrary shape.

Returns

Tensor

Output tensor of the same shape as input.
