orthogonal_

orthogonal_(tensor: Tensor, gain: float = 1.0) -> Tensor

Initialise tensor in-place with a (semi-)orthogonal matrix.

A random Gaussian matrix is drawn and orthonormalised via QR decomposition; the resulting matrix Q is multiplied by gain and written into tensor. For tensors with rank > 2 the leading axis is flattened against the remaining axes, the 2-D matrix is orthogonalised, and the original shape is restored.
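The procedure above can be sketched in plain NumPy. This is an illustrative stand-in, not lucid's actual implementation: the function name `orthogonal_sketch`, the `seed` argument, and the sign correction on the diagonal of R are all assumptions that the description does not state.

```python
import numpy as np


def orthogonal_sketch(shape, gain=1.0, seed=0):
    """Hypothetical NumPy illustration of the steps described above."""
    if len(shape) < 2:
        raise ValueError("shape must have at least 2 dimensions")

    # Flatten the leading axis against all remaining axes.
    rows = shape[0]
    cols = int(np.prod(shape[1:]))

    # Draw a tall (or square) Gaussian matrix so that reduced QR
    # returns a Q with orthonormal columns.
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((max(rows, cols), min(rows, cols)))
    q, r = np.linalg.qr(a)

    # Assumption: flip column signs so diag(R) is positive, a common
    # convention that makes the decomposition unique.
    d = np.sign(np.diag(r))
    d[d == 0] = 1.0
    q = q * d

    # A wide target needs orthonormal rows instead of columns.
    if rows < cols:
        q = q.T

    # Scale by gain and restore the original shape.
    return (gain * q).reshape(shape)


w = orthogonal_sketch((4, 3, 2), gain=2.0)
```

Drawing the Gaussian matrix in its tall orientation and transposing afterwards keeps a single QR call for both the tall and the wide case.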

Orthogonal initialisation, proposed by Saxe et al. (2013), preserves vector norms exactly through deep linear networks when gain is 1: $\|Wx\|_2 = \|x\|_2$ because $W$ has orthonormal columns, so every singular value of the forward Jacobian equals 1. This is particularly valuable for RNNs and pre-norm transformer blocks, where activation magnitudes can otherwise drift exponentially with depth.
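The norm-preservation claim can be checked numerically with a small NumPy experiment. It is a sketch under stated assumptions: the width, depth, and the Gaussian baseline with standard deviation `1.5 / sqrt(dim)` are arbitrary illustrative choices, and the orthogonal matrices come from `np.linalg.qr` rather than from `orthogonal_`.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, depth = 64, 50  # illustrative sizes, not taken from the source

x = rng.standard_normal(dim)
h_orth = x.copy()
h_gauss = x.copy()

for _ in range(depth):
    # Square orthogonal layer: Q from the QR of a Gaussian matrix.
    q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
    h_orth = q @ h_orth

    # Baseline: a slightly over-scaled Gaussian layer (assumed for contrast).
    g = rng.standard_normal((dim, dim)) * (1.5 / np.sqrt(dim))
    h_gauss = g @ h_gauss

# Ratio of output norm to input norm after `depth` linear layers.
ratio_orth = np.linalg.norm(h_orth) / np.linalg.norm(x)
ratio_gauss = np.linalg.norm(h_gauss) / np.linalg.norm(x)
print(f"orthogonal: {ratio_orth:.6f}  gaussian: {ratio_gauss:.3e}")
```

The orthogonal stack keeps the ratio at 1 up to floating-point error, while the over-scaled Gaussian stack grows by several orders of magnitude.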

Parameters

tensor: Tensor
Tensor to initialise in place; must have at least 2 dimensions.
gain: float = 1.0
Multiplicative scaling applied to Q. Default 1.0.

Returns

Tensor

tensor (mutated) for chaining.

Raises

ValueError
If tensor.ndim < 2.

Notes

Writing $W = \text{gain} \cdot Q$ for the matrix stored in tensor, $W^\top W = \text{gain}^2 I$ (or $W W^\top = \text{gain}^2 I$ when fan-out exceeds fan-in).
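A quick NumPy check of this identity for both orientations is sketched below. It is phrased in terms of rows and columns of the flattened matrix, because how those map to fan-in and fan-out depends on the weight layout, which is not stated here. The helper `semi_orthogonal` is hypothetical and is not part of lucid.

```python
import numpy as np

rng = np.random.default_rng(0)
gain = 0.5


def semi_orthogonal(rows, cols):
    """Hypothetical helper: gain times a (semi-)orthogonal rows x cols matrix."""
    a = rng.standard_normal((max(rows, cols), min(rows, cols)))
    q, _ = np.linalg.qr(a)  # reduced QR: orthonormal columns
    return gain * (q if rows >= cols else q.T)


tall = semi_orthogonal(8, 3)  # more rows than columns
wide = semi_orthogonal(3, 8)  # more columns than rows

# The Gram matrix over the smaller dimension is gain^2 times the identity.
tall_ok = np.allclose(tall.T @ tall, gain**2 * np.eye(3))
wide_ok = np.allclose(wide @ wide.T, gain**2 * np.eye(3))

# Over the larger dimension the product has rank 3, so it cannot be
# a multiple of the 8 x 8 identity.
tall_other = np.allclose(tall @ tall.T, gain**2 * np.eye(8))
```

Only one of the two products can be a scaled identity for a non-square matrix, which is why the note distinguishes the two cases.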

Examples

>>> import lucid
>>> from lucid.nn.init import orthogonal_
>>> w = lucid.empty(64, 64)
>>> orthogonal_(w, gain=1.0)