group_norm

Tensor
group_norm(x: Tensor, num_groups: int, weight: Tensor | None = None, bias: Tensor | None = None, eps: float = 1e-05)

Group normalization (Wu & He, 2018).

Splits the channel dimension into num_groups contiguous groups and normalises each (sample, group) slice independently across its channels and spatial axes. This combines the spatial reduction of BatchNorm with the per-sample statistics of LayerNorm. Because no statistic is computed across the batch, performance is largely independent of batch size, which makes group normalization a common choice for detection and segmentation models trained with very small batches.

Parameters

x : Tensor
Input of shape (N, C, *spatial) where C must be divisible by num_groups.
num_groups : int
Number of channel groups. Two limiting cases: num_groups == C reduces to InstanceNorm; num_groups == 1 reduces to LayerNorm over the channel and spatial axes.
weight : Tensor | None = None
Optional per-channel scale $\gamma$ of shape (C,).
bias : Tensor | None = None
Optional per-channel shift $\beta$ of shape (C,).
eps : float = 1e-05
Small constant added to the variance inside the square root for numerical stability.

Returns

Tensor

Same shape as x.

Notes

Math (let $G_g$ be the channel set of group $g$, $g(c)$ the group that contains channel $c$, and $S$ the set of spatial positions):

$$
\begin{aligned}
\mu_{n,g} &= \frac{1}{|G_g||S|} \sum_{c \in G_g} \sum_{s \in S} x_{n,c,s} \\
\sigma^2_{n,g} &= \frac{1}{|G_g||S|} \sum_{c \in G_g} \sum_{s \in S} (x_{n,c,s} - \mu_{n,g})^2 \\
y_{n,c,s} &= \gamma_c \cdot \frac{x_{n,c,s} - \mu_{n,g(c)}}{\sqrt{\sigma^2_{n,g(c)} + \epsilon}} + \beta_c
\end{aligned}
$$
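The formulas above can be sketched in plain NumPy. This is an illustrative reference only, not lucid's implementation: the name `group_norm_ref` is hypothetical, and treating a missing weight or bias as "no scale" or "no shift" is an assumption.

```python
import numpy as np

def group_norm_ref(x, num_groups, weight=None, bias=None, eps=1e-5):
    """NumPy sketch of the group-norm formulas (hypothetical helper)."""
    n, c = x.shape[:2]
    if c % num_groups != 0:
        raise ValueError("C must be divisible by num_groups")
    # Fold each group's channels and all spatial positions into one axis,
    # so every (sample, group) slice becomes a single row of values.
    grouped = x.reshape(n, num_groups, -1)
    mean = grouped.mean(axis=2, keepdims=True)
    var = grouped.var(axis=2, keepdims=True)  # biased: divides by |G_g||S|
    y = ((grouped - mean) / np.sqrt(var + eps)).reshape(x.shape)
    # Per-channel affine parameters broadcast over the spatial axes.
    param_shape = (1, c) + (1,) * (x.ndim - 2)
    if weight is not None:
        y = y * weight.reshape(param_shape)
    if bias is not None:
        y = y + bias.reshape(param_shape)
    return y

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 32, 16, 16))
y = group_norm_ref(x, num_groups=8)

# Without weight/bias, each (sample, group) slice should have
# mean close to 0 and variance close to 1.
per_group = y.reshape(2, 8, -1)
group_means = per_group.mean(axis=2)
group_vars = per_group.var(axis=2)
```

The reshape relies on the groups being contiguous along the channel axis, which matches the description of num_groups above.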

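Because the statistics never depend on the batch, the computation is identical in training and evaluation. This avoids the train/eval mismatch that BatchNorm has to bridge with running mean and variance buffers.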
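Both the batch independence and the two limiting cases of num_groups can be checked numerically. The sketch below uses a compact NumPy stand-in (`gn`, a hypothetical helper without weight or bias) rather than lucid itself, so it illustrates the math, not the library's internals.

```python
import numpy as np

def gn(x, num_groups, eps=1e-5):
    # Compact NumPy stand-in for group normalization without weight/bias
    # (an assumption for illustration, not lucid's implementation).
    n = x.shape[0]
    g = x.reshape(n, num_groups, -1)
    g = (g - g.mean(axis=2, keepdims=True)) / np.sqrt(g.var(axis=2, keepdims=True) + eps)
    return g.reshape(x.shape)

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 6, 5, 5))

# Batch independence: a sample normalised alone matches the batched result.
batch_independent = np.allclose(gn(x, 3)[:1], gn(x[:1], 3))

# num_groups == C: statistics per (sample, channel), as in InstanceNorm.
inst = (x - x.mean(axis=(2, 3), keepdims=True)) / np.sqrt(
    x.var(axis=(2, 3), keepdims=True) + 1e-5
)
matches_instance_norm = np.allclose(gn(x, 6), inst)

# num_groups == 1: statistics per sample over channels and spatial axes.
layer = (x - x.mean(axis=(1, 2, 3), keepdims=True)) / np.sqrt(
    x.var(axis=(1, 2, 3), keepdims=True) + 1e-5
)
matches_layer_norm = np.allclose(gn(x, 1), layer)

print(batch_independent, matches_instance_norm, matches_layer_norm)
```

The first check is the property that removes the need for running buffers: the output for a sample is the same whether it is processed alone or as part of a larger batch.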

Examples

>>> import lucid
>>> from lucid.nn.functional import group_norm
>>> x = lucid.randn(2, 32, 16, 16)
>>> y = group_norm(x, num_groups=8)
>>> y.shape
(2, 32, 16, 16)