
xavier_uniform_

xavier_uniform_(tensor: Tensor, gain: float = 1.0) -> Tensor

Initialise tensor in place with the Xavier (Glorot) uniform scheme.

Fills the tensor with values drawn uniformly from $[-a, a]$, where the limit $a$ is chosen so that the variance of activations is approximately preserved across a stack of linear or mildly nonlinear layers. Introduced in Glorot & Bengio (2010), this scheme is well suited to tanh and sigmoid networks; for ReLU networks prefer kaiming_uniform_, which compensates for ReLU zeroing out negative pre-activations.
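To make the difference between the two schemes concrete, the sketch below compares the Xavier limit (the formula given under Notes) with a He-style limit for ReLU. The helper names are hypothetical, and the Kaiming formula is the standard fan-in form from He et al. (2015) with gain sqrt(2); it is assumed, not confirmed, that kaiming_uniform_ in this library follows it.

```python
import math


def xavier_uniform_bound(fan_in: int, fan_out: int, gain: float = 1.0) -> float:
    """Limit a of the Xavier uniform range [-a, a]."""
    return gain * math.sqrt(6.0 / (fan_in + fan_out))


def kaiming_uniform_bound(fan_in: int, gain: float = math.sqrt(2.0)) -> float:
    """He-style limit (assumed convention): std = gain / sqrt(fan_in), a = sqrt(3) * std."""
    return gain * math.sqrt(3.0 / fan_in)


# Fans of a hypothetical layer with 32 inputs and 64 outputs.
fan_in, fan_out = 32, 64
xavier_a = xavier_uniform_bound(fan_in, fan_out)   # sqrt(6 / 96) = 0.25
kaiming_a = kaiming_uniform_bound(fan_in)          # sqrt(6 / 32)
print(f"Xavier limit:  {xavier_a:.4f}")
print(f"Kaiming limit: {kaiming_a:.4f}")
```

Under these assumptions the He-style limit is wider, since it depends only on fan_in and includes the extra factor that offsets ReLU discarding roughly half of the pre-activations.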

Parameters

tensor: Tensor
Tensor to initialise in place; must have at least 2 dimensions so fan_in and fan_out can be computed.
gain: float = 1.0
Multiplicative gain factor, typically the value returned by calculate_gain for the downstream nonlinearity. Default 1.0.
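The two-dimension requirement follows from how the fans are usually derived from the shape. The sketch below uses the common (out, in, *kernel) weight layout; the function name is hypothetical, and whether _calculate_fan_in_and_fan_out uses exactly this convention is an assumption.

```python
import math
from typing import Sequence, Tuple


def fan_in_and_fan_out(shape: Sequence[int]) -> Tuple[int, int]:
    """Compute (fan_in, fan_out) for a weight of the given shape.

    Assumes the layout (out_features, in_features, *kernel_dims).
    """
    if len(shape) < 2:
        raise ValueError("fan_in and fan_out need at least 2 dimensions")
    # Product of any trailing kernel dimensions; equals 1 for a 2-D weight.
    receptive_field = math.prod(shape[2:])
    fan_in = shape[1] * receptive_field
    fan_out = shape[0] * receptive_field
    return fan_in, fan_out


print(fan_in_and_fan_out((64, 32)))        # linear weight
print(fan_in_and_fan_out((16, 3, 3, 3)))   # 3x3 convolution kernel
```

A 1-D tensor such as a bias has no separate input and output axes, which is why it is rejected here.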

Returns

Tensor

The input tensor, modified in place and returned to allow chaining.

Notes

Let $n_\text{in}$ and $n_\text{out}$ be the fan-in and fan-out of tensor (see _calculate_fan_in_and_fan_out). The uniform range is

$$a = \text{gain} \cdot \sqrt{\frac{6}{n_\text{in} + n_\text{out}}},$$

which yields variance

$$\mathrm{Var}(W) = \frac{a^2}{3} = \frac{2 \cdot \text{gain}^2}{n_\text{in} + n_\text{out}}.$$

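With gain = 1, this variance is a compromise between 1/n_in, which preserves the variance of activations in the forward pass of a linear layer, and 1/n_out, which preserves the variance of gradients in the backward pass; the two conditions coincide only when n_in = n_out.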
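The relation between the limit and the variance can be checked numerically. The following is a minimal standard-library sketch, not the library's implementation: the function name, sample count, and seed are illustrative choices.

```python
import math
import random


def xavier_uniform_samples(fan_in, fan_out, gain=1.0, n_samples=100_000, seed=0):
    """Draw n_samples values from U(-a, a) with the Xavier limit a."""
    a = gain * math.sqrt(6.0 / (fan_in + fan_out))
    rng = random.Random(seed)
    return a, [rng.uniform(-a, a) for _ in range(n_samples)]


fan_in, fan_out, gain = 32, 64, 1.0
a, values = xavier_uniform_samples(fan_in, fan_out, gain)

# Empirical variance of the samples.
mean = sum(values) / len(values)
empirical_var = sum((v - mean) ** 2 for v in values) / len(values)

# Closed-form variance: a^2 / 3 = 2 * gain^2 / (fan_in + fan_out).
expected_var = 2.0 * gain ** 2 / (fan_in + fan_out)

print(f"limit a       = {a:.4f}")
print(f"empirical var = {empirical_var:.5f}")
print(f"expected var  = {expected_var:.5f}")
```

The empirical and closed-form variances should agree to within sampling error, and every sample lies inside [-a, a].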

Examples

>>> import lucid
>>> from lucid.nn.init import xavier_uniform_, calculate_gain
>>> w = lucid.empty(64, 32)
>>> xavier_uniform_(w, gain=calculate_gain('tanh'))