kaiming_uniform_

kaiming_uniform_(tensor: Tensor, a: float = 0, mode: str = 'fan_in', nonlinearity: str = 'leaky_relu') -> Tensor

Initialise tensor in-place with Kaiming (He) uniform initialisation.

Draws each entry uniformly from $[-b, b]$, with the bound $b$ chosen so that the variance of activations is preserved across a stack of ReLU-family layers. Introduced in He et al. (2015), this scheme corrects the Xavier formula for the fact that ReLU zeroes half of its pre-activations, which would otherwise halve the forward variance at every layer.
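The variance argument can be checked numerically. The sketch below is plain Python and does not call lucid; the helper names (`make_layer`, `preactivation_variances`) and the layer sizes are illustrative assumptions. It pushes a small random batch through a stack of bias-free ReLU layers twice, reusing the same random draws, once scaled to Var(W) = 2/n (Kaiming with the ReLU gain) and once to Var(W) = 1/n (the Xavier variance for square layers), and compares the pre-activation variance at the first and last layer.

```python
import math
import random

def make_layer(rng, n, weight_var):
    """Return an n x n weight matrix with entries uniform in [-b, b]."""
    b = math.sqrt(3.0 * weight_var)  # Var(U[-b, b]) = b^2 / 3
    return [[rng.uniform(-b, b) for _ in range(n)] for _ in range(n)]

def preactivation_variances(scale, n=128, depth=6, batch=4, seed=0):
    """Mean squared pre-activation per layer, with Var(W) = scale / n."""
    rng = random.Random(seed)  # same seed -> same draws for every scale
    hidden = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(batch)]
    layers = [make_layer(rng, n, scale / n) for _ in range(depth)]
    variances = []
    for w in layers:
        pre = [[sum(wij * hj for wij, hj in zip(row, h)) for row in w]
               for h in hidden]
        variances.append(sum(z * z for vec in pre for z in vec) / (batch * n))
        hidden = [[max(z, 0.0) for z in vec] for vec in pre]  # ReLU
    return variances

kaiming = preactivation_variances(2.0)  # Var(W) = 2 / n
xavier = preactivation_variances(1.0)   # Var(W) = 1 / n

kaiming_ratio = kaiming[-1] / kaiming[0]
xavier_ratio = xavier[-1] / xavier[0]
print(f"Kaiming last/first variance: {kaiming_ratio:.3f}")
print(f"Xavier  last/first variance: {xavier_ratio:.5f}")
```

Because ReLU is positively homogeneous and both runs reuse the same draws, the Xavier ratio is the Kaiming ratio divided by 2 for each of the `depth - 1` layer transitions. The Kaiming ratio stays of order one, although its exact value depends on the seed and the layer width.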

Parameters

tensor: Tensor
Tensor to initialise in place; must have at least 2 dimensions.
a: float = 0
Negative slope of the rectifier used after this layer (only used when nonlinearity='leaky_relu'). Default 0.
mode: {'fan_in', 'fan_out'} = 'fan_in'
Which fan to use as the variance scaler. 'fan_in' (default) preserves the magnitude of activations in the forward pass; 'fan_out' preserves the magnitude of gradients in the backward pass.
nonlinearity: str = 'leaky_relu'
Nonlinearity name forwarded to calculate_gain. Default 'leaky_relu'.
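
As a rough illustration of how `mode`, `a`, and `nonlinearity` interact, the sketch below computes the fan and the gain in plain Python. It assumes the widely used convention (as in PyTorch) in which a weight of shape (out, in, *kernel) has fan_in = in × prod(kernel) and fan_out = out × prod(kernel), together with the He et al. gain sqrt(2 / (1 + a²)) for a leaky ReLU. `fans` and `leaky_relu_gain` are hypothetical helpers, not lucid functions, and lucid's `calculate_gain` may differ in detail.

```python
import math

def fans(shape):
    """(fan_in, fan_out) for a weight of shape (out, in, *kernel).

    Assumed convention; not taken from lucid's source.
    """
    if len(shape) < 2:
        raise ValueError("fan is undefined for tensors with fewer than 2 dimensions")
    receptive_field = math.prod(shape[2:])  # 1 for a plain 2-D weight
    return shape[1] * receptive_field, shape[0] * receptive_field

def leaky_relu_gain(a=0.0):
    """He et al. gain for a rectifier with negative slope a (a = 0 is plain ReLU)."""
    return math.sqrt(2.0 / (1.0 + a * a))

fan_in, fan_out = fans((64, 32))  # linear weight: 32 inputs, 64 outputs
conv_fans = fans((16, 3, 5, 5))   # conv weight: 3 -> 16 channels, 5x5 kernel
```

With the default a = 0, the leaky-ReLU gain reduces to the ReLU gain sqrt(2); a = 1 makes the rectifier linear and the gain 1.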

Returns

Tensor

The input tensor, modified in place and returned so that calls can be chained.

Notes

With $n = \text{fan}$ (selected by mode), the bound is

$$b = \text{gain} \cdot \sqrt{\frac{3}{n}},$$

which for ReLU ($\text{gain} = \sqrt{2}$) is $b = \sqrt{6/n}$. Since a uniform distribution on $[-b, b]$ has variance $b^2/3$, this gives

$$\mathrm{Var}(W) = \frac{b^2}{3} = \frac{\text{gain}^2}{n} = \frac{2}{n} \quad \text{for ReLU.}$$
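
To make the bound concrete, the following sketch picks the half-width b whose uniform variance b²/3 equals the stated gain²/n, that is b = gain·sqrt(3/n) (sqrt(6/n) for ReLU), and compares the empirical variance of samples against the target. It is plain Python; `kaiming_bound` is a hypothetical helper and does not reproduce lucid's implementation.

```python
import math
import random

def kaiming_bound(fan, gain):
    """Half-width b of U[-b, b] whose variance b^2 / 3 equals gain^2 / fan."""
    return gain * math.sqrt(3.0 / fan)

fan = 32                      # e.g. fan_in of a (64, 32) weight
gain = math.sqrt(2.0)         # ReLU gain
b = kaiming_bound(fan, gain)  # equals sqrt(6 / fan) for ReLU

rng = random.Random(0)
samples = [rng.uniform(-b, b) for _ in range(200_000)]
# The distribution is symmetric about 0, so the mean square estimates the variance.
empirical_var = sum(x * x for x in samples) / len(samples)
target_var = gain ** 2 / fan
print(f"b = {b:.4f}, empirical Var = {empirical_var:.5f}, target Var = {target_var:.5f}")
```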

Examples

>>> import lucid
>>> from lucid.nn.init import kaiming_uniform_
>>> w = lucid.empty(64, 32)
>>> kaiming_uniform_(w, nonlinearity='relu')