kaiming_uniform_

→Tensor

kaiming_uniform_(tensor: Tensor, a: float = 0, mode: str = 'fan_in', nonlinearity: str = 'leaky_relu')

Initialise a tensor in place with Kaiming (He) uniform initialisation.

Draws each entry uniformly from $[-b, b]$ with the bound chosen so that the variance of activations is preserved across a stack of ReLU-family layers. Introduced in He et al. (2015), this scheme corrects the Xavier formula for the fact that ReLU zeroes half of its pre-activations, which would otherwise halve the forward variance at every layer.
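The halving effect can be checked numerically. The following is a minimal NumPy sketch, not part of lucid: the helper names `uniform_weights` and `mean_square_ratio`, and the depth, width, and batch size, are illustrative assumptions. It pushes random inputs through a stack of square ReLU layers and compares the Kaiming variance $2/\text{fan\_in}$ with the Xavier variance $2/(\text{fan\_in} + \text{fan\_out})$.

```python
import numpy as np


def uniform_weights(rng, fan_in, fan_out, var):
    # Uniform[-b, b] has variance b**2 / 3, so b = sqrt(3 * var).
    b = np.sqrt(3.0 * var)
    return rng.uniform(-b, b, size=(fan_out, fan_in))


def mean_square_ratio(var_fn, depth=20, width=512, batch=256, seed=0):
    """Return (final mean-square activation) / (input mean-square)
    after `depth` ReLU layers whose weight variance is var_fn(fan_in, fan_out)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((batch, width))
    start = np.mean(x ** 2)
    for _ in range(depth):
        w = uniform_weights(rng, width, width, var_fn(width, width))
        x = np.maximum(x @ w.T, 0.0)  # linear layer followed by ReLU
    return np.mean(x ** 2) / start


# Kaiming (fan_in mode, ReLU): Var(W) = 2 / fan_in
he_ratio = mean_square_ratio(lambda fan_in, fan_out: 2.0 / fan_in)
# Xavier: Var(W) = 2 / (fan_in + fan_out)
xavier_ratio = mean_square_ratio(lambda fan_in, fan_out: 2.0 / (fan_in + fan_out))

print(f"Kaiming ratio: {he_ratio:.3g}")
print(f"Xavier ratio:  {xavier_ratio:.3g}")
```

With these settings the Kaiming ratio should stay within an order of magnitude of 1, while the Xavier ratio shrinks by roughly a factor of two per layer.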

Parameters

tensor: Tensor

Tensor to initialise in place; must have at least 2 dimensions.

a: float = 0

Negative slope of the rectifier used after this layer (only used when nonlinearity='leaky_relu'). Default 0.

mode: {'fan_in', 'fan_out'} = 'fan_in'

Which fan to use as the variance scaler. 'fan_in' (default) preserves the variance of activations in the forward pass; 'fan_out' preserves the variance of gradients in the backward pass.

nonlinearity: str = 'leaky_relu'

Nonlinearity name forwarded to calculate_gain. Default 'leaky_relu'.
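As a rough illustration of how `mode`, `a`, and `nonlinearity` feed into the scale, the sketch below computes the two fans and a gain in plain Python. It rests on two assumptions that this page does not confirm: that weights are laid out as `(out_features, in_features, *kernel_dims)`, and that the leaky-ReLU gain is $\sqrt{2 / (1 + a^2)}$, the convention used by comparable libraries. `compute_fans` and `gain_for` are hypothetical helper names, not lucid functions; consult `calculate_gain` for the values lucid actually uses.

```python
import math


def compute_fans(shape):
    # Assumed layout: (out_features, in_features, *kernel_dims).
    if len(shape) < 2:
        raise ValueError("fan_in/fan_out require at least 2 dimensions")
    receptive_field = math.prod(shape[2:])  # equals 1 for a plain 2-D weight
    fan_in = shape[1] * receptive_field
    fan_out = shape[0] * receptive_field
    return fan_in, fan_out


def gain_for(nonlinearity, a=0.0):
    # Assumed gain formulas; lucid's calculate_gain may differ.
    if nonlinearity == "relu":
        return math.sqrt(2.0)
    if nonlinearity == "leaky_relu":
        return math.sqrt(2.0 / (1.0 + a ** 2))
    raise ValueError(f"unsupported nonlinearity in this sketch: {nonlinearity!r}")


linear_fans = compute_fans((64, 32))      # 2-D weight: 64 outputs, 32 inputs
conv_fans = compute_fans((16, 3, 3, 3))   # conv weight: 16 out, 3 in, 3x3 kernel
print(linear_fans, conv_fans)
print(gain_for("leaky_relu", a=0.0), gain_for("leaky_relu", a=0.2))
```

Under these assumptions the defaults (`nonlinearity='leaky_relu'`, `a=0`) give the same gain as `'relu'`, since a leaky rectifier with zero negative slope is a plain ReLU.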

Returns

Tensor

The input tensor, mutated in place and returned so that calls can be chained.

Notes

With $n = \text{fan}$ (selected by mode) the bound is

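$$b = \text{gain} \cdot \sqrt{\frac{3}{n}},$$

which, since a uniform distribution on $[-b, b]$ has variance $b^2 / 3$, gives

$$\mathrm{Var}(W) = \frac{b^2}{3} = \frac{\text{gain}^2}{n} = \frac{2}{n} \quad \text{for ReLU } (\text{gain} = \sqrt{2}).$$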
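The relation between the bound and the variance can be verified empirically. The sketch below uses NumPy rather than lucid and takes the bound consistent with the stated variance, $b = \text{gain} \cdot \sqrt{3/n}$; the helper name `kaiming_uniform_bound` and the sample size are assumptions made for illustration.

```python
import numpy as np


def kaiming_uniform_bound(n, gain):
    # Uniform[-b, b] has variance b**2 / 3, so Var(W) = gain**2 / n
    # requires b = gain * sqrt(3 / n).
    return gain * np.sqrt(3.0 / n)


rng = np.random.default_rng(0)
fan = 32                    # n, e.g. fan_in of a (64, 32) weight
gain = np.sqrt(2.0)         # ReLU gain, so the target variance is 2 / n

b = kaiming_uniform_bound(fan, gain)
# Draw many samples so the empirical variance is a stable estimate.
w = rng.uniform(-b, b, size=(4096, fan))

print(f"bound b            = {b:.4f}")
print(f"target variance    = {2.0 / fan:.4f}")
print(f"empirical variance = {w.var():.4f}")
```

For ReLU the bound reduces to $\sqrt{6/n}$, because the gain of $\sqrt{2}$ is absorbed into the square root.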

Examples

>>> import lucid
>>> from lucid.nn.init import kaiming_uniform_
>>> w = lucid.empty(64, 32)
>>> kaiming_uniform_(w, nonlinearity='relu')