xavier_uniform_

→Tensor

xavier_uniform_(tensor: Tensor, gain: float = 1.0)

Initialise tensor in place using the Xavier (Glorot) uniform scheme.

Fills the tensor with values drawn uniformly from $[-a, a]$, where the limit $a$ is chosen so that the variance of activations is approximately preserved across a stack of linear or mildly nonlinear layers. Introduced in Glorot & Bengio (2010), this scheme is well suited to tanh and sigmoid networks. For ReLU networks prefer kaiming_uniform_, which compensates for ReLU zeroing the negative pre-activations.

Parameters

tensor: Tensor

Tensor to initialise in place; must have at least 2 dimensions so fan_in and fan_out can be computed.

gain: float = 1.0

Multiplicative gain factor, typically the value returned by calculate_gain for the nonlinearity that follows the layer. Defaults to 1.0.

Returns

Tensor

The same tensor, modified in place, returned so that calls can be chained.

Notes

Let $n_\text{in}$ and $n_\text{out}$ be the fan-in and fan-out of tensor (see _calculate_fan_in_and_fan_out). The uniform range is

$$a = \text{gain} \cdot \sqrt{\frac{6}{n_\text{in} + n_\text{out}}},$$

which yields variance

$$\mathrm{Var}(W) = \frac{a^2}{3} = \frac{2 \cdot \text{gain}^2}{n_\text{in} + n_\text{out}}.$$

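With gain = 1, this variance is a compromise between the forward condition $\mathrm{Var}(W) = 1/n_\text{in}$, which preserves the variance of activations in a linear layer, and the backward condition $\mathrm{Var}(W) = 1/n_\text{out}$, which preserves the variance of gradients. Both hold exactly only when $n_\text{in} = n_\text{out}$; otherwise each is preserved approximately.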
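The limit and variance formulas in these notes can be checked numerically. The following is a minimal pure-Python sketch, not lucid's implementation. The helper names `fans`, `xavier_limit` and `xavier_uniform_reference` are hypothetical, and the fan computation assumes a PyTorch-style layout `(out, in, *kernel_dims)`, which may differ from what _calculate_fan_in_and_fan_out actually does.

```python
import math
import random


def fans(shape):
    """Return (fan_in, fan_out) for a weight shape.

    Assumption: PyTorch-style layout (out, in, *kernel_dims).
    """
    if len(shape) < 2:
        raise ValueError("need at least 2 dimensions to compute fan_in and fan_out")
    receptive_field = math.prod(shape[2:])  # 1 for a plain 2-D weight
    fan_in = shape[1] * receptive_field
    fan_out = shape[0] * receptive_field
    return fan_in, fan_out


def xavier_limit(shape, gain=1.0):
    """Limit a = gain * sqrt(6 / (fan_in + fan_out))."""
    fan_in, fan_out = fans(shape)
    return gain * math.sqrt(6.0 / (fan_in + fan_out))


def xavier_uniform_reference(shape, gain=1.0, seed=0):
    """Flat list of prod(shape) samples drawn uniformly from [-a, a]."""
    rng = random.Random(seed)
    a = xavier_limit(shape, gain)
    return [rng.uniform(-a, a) for _ in range(math.prod(shape))]


shape = (64, 32)
a = xavier_limit(shape)  # sqrt(6 / 96) = 0.25
samples = xavier_uniform_reference(shape)

mean = sum(samples) / len(samples)
empirical_var = sum((x - mean) ** 2 for x in samples) / len(samples)
predicted_var = 2.0 / (32 + 64)  # 2 * gain^2 / (n_in + n_out) with gain = 1

print(f"limit a       = {a}")
print(f"empirical var = {empirical_var:.5f}")
print(f"predicted var = {predicted_var:.5f}")
```

For a 2-D weight the limit depends only on the sum $n_\text{in} + n_\text{out}$, so the choice of which axis counts as fan-in does not change the result here. It matters only for schemes, such as Kaiming initialisation, that use one fan alone.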

Examples

>>> import lucid
>>> from lucid.nn.init import xavier_uniform_, calculate_gain
>>> w = lucid.empty(64, 32)
>>> xavier_uniform_(w, gain=calculate_gain('tanh'))