orthogonal_

orthogonal_(tensor: Tensor, gain: float = 1.0) -> Tensor

Initialise tensor in-place with a (semi-)orthogonal matrix.

A random Gaussian matrix is drawn and orthonormalised via QR decomposition; the resulting factor Q is multiplied by gain and written into tensor. A tensor with more than two dimensions is treated as a 2-D matrix whose rows index the leading axis and whose columns span all remaining axes flattened together; that matrix is orthogonalised and the original shape is then restored.
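The procedure described above can be sketched in NumPy. This is an illustration under stated assumptions, not lucid's implementation: the name `orthogonal_sketch`, the `rng` parameter, the transpose used for wide shapes, and the sign correction taken from the diagonal of R are all choices made for this sketch.

```python
import numpy as np


def orthogonal_sketch(shape, gain=1.0, rng=None):
    """Return a (semi-)orthogonal array of the given shape (illustrative only)."""
    if len(shape) < 2:
        raise ValueError("need at least 2 dimensions")
    rng = np.random.default_rng() if rng is None else rng

    rows = shape[0]
    cols = int(np.prod(shape[1:]))  # flatten all trailing axes into columns

    # Draw a Gaussian matrix and factor it in its tall orientation,
    # so that the reduced Q factor has orthonormal columns.
    a = rng.standard_normal((rows, cols))
    if rows < cols:
        a = a.T
    q, r = np.linalg.qr(a)  # reduced QR: q is (max, min), r is (min, min)

    # Assumption: flip column signs using diag(R) so the result does not
    # depend on the sign convention of the underlying QR routine.
    d = np.sign(np.diag(r))
    d[d == 0] = 1.0
    q = q * d

    if rows < cols:
        q = q.T
    return (gain * q).reshape(shape)
```

Factoring the tall orientation is what makes the result semi-orthogonal for non-square shapes: only the shorter side of the matrix can carry a full orthonormal set.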

Orthogonal initialisation, proposed by Saxe et al. (2013), preserves vector norms exactly through deep linear networks when gain is 1 ( $\|Wx\|_2 = \|x\|_2$ whenever W has orthonormal columns), so every singular value of the forward Jacobian equals 1. This is particularly valuable for RNNs and pre-norm transformer blocks, where activation magnitudes can otherwise grow or shrink exponentially with depth.
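The norm-preservation claim can be checked numerically with a stack of square orthogonal layers. The following is a small NumPy sketch rather than lucid code; `norm_ratio`, the width `n`, and `depth` are hypothetical names and values chosen for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, depth = 64, 50
x = rng.standard_normal(n)


def norm_ratio(gain):
    """Ratio ||h_depth|| / ||x|| after `depth` linear layers of gain * Q."""
    h = x.copy()
    for _ in range(depth):
        # Q factor of a square Gaussian matrix is orthogonal.
        q, _r = np.linalg.qr(rng.standard_normal((n, n)))
        h = (gain * q) @ h
    return np.linalg.norm(h) / np.linalg.norm(x)


ratio_unit = norm_ratio(1.0)    # norm is preserved layer by layer
ratio_scaled = norm_ratio(1.1)  # norm scales by 1.1 at every layer
```

With gain 1 the ratio stays at 1 up to rounding, while any other gain compounds as `gain ** depth`, which is the exponential growth or decay that the initialisation is meant to avoid.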

Parameters

tensor: Tensor

Tensor to initialise in place; must have at least 2 dimensions.

gain: float = 1.0

Multiplicative scaling applied to Q. Default 1.0.

Returns

Tensor

tensor (mutated) for chaining.

Raises

ValueError

If tensor.ndim < 2.

Notes

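Let $W = \text{gain} \cdot Q$ be the matrix written into tensor, viewed as rows × columns with the leading axis as rows. It satisfies $W^\top W = \text{gain}^2 I$ when it has at least as many rows as columns, and $W W^\top = \text{gain}^2 I$ when it has fewer rows than columns. Only the smaller of the two Gram matrices can be a multiple of the identity, since the larger one is rank-deficient.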
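A quick way to test this property on an initialised weight is to measure how far the smaller Gram matrix is from a scaled identity. The helper below is a hypothetical NumPy sketch (`gram_error` is not part of lucid) and assumes the tensor has first been converted to a NumPy array.

```python
import numpy as np


def gram_error(w, gain=1.0):
    """Max abs deviation of the smaller Gram matrix from gain**2 * I."""
    m = np.asarray(w, dtype=np.float64)
    m = m.reshape(m.shape[0], -1)  # leading axis as rows, the rest as columns
    rows, cols = m.shape
    # Tall or square: check the columns. Wide: check the rows.
    gram = m.T @ m if rows >= cols else m @ m.T
    return float(np.abs(gram - gain**2 * np.eye(min(rows, cols))).max())


rng = np.random.default_rng(0)
q_tall, _r = np.linalg.qr(rng.standard_normal((6, 4)))  # orthonormal columns

err_tall = gram_error(0.5 * q_tall, gain=0.5)    # 6 x 4: checks W^T W
err_wide = gram_error(0.5 * q_tall.T, gain=0.5)  # 4 x 6: checks W W^T
```

Both errors should be at the level of floating-point rounding, whereas the 6 × 6 product of the tall matrix with its transpose has rank 4 and cannot equal the identity.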

Examples

>>> import lucid
>>> from lucid.nn.init import orthogonal_
>>> w = lucid.empty(64, 64)
>>> orthogonal_(w, gain=1.0)