GroupNorm
Module GroupNorm(num_groups: int, num_channels: int, eps: float = 1e-05, affine: bool = True, device: DeviceLike = None, dtype: DTypeLike = None)
Group normalization over the channel dimension.
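Divides the channels into num_groups contiguous groups
of size $C / G$ (with $C$ = num_channels and $G$ = num_groups) and normalises each group
independently over its spatial elements:

$$y = \frac{x - \mu_g}{\sqrt{\sigma_g^2 + \mathrm{eps}}}$$

where $\mu_g$ and $\sigma_g^2$ are the mean and variance computed over a single group (channels + spatial axes) for each sample in the batch.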
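To make the per-group statistics concrete, here is a minimal NumPy sketch of the normalisation step. It is an illustration, not Lucid's implementation: the function name `group_norm_reference` is hypothetical, the affine scale and shift are left out, and the use of the biased (population) variance is an assumption.

```python
import numpy as np

def group_norm_reference(x, num_groups, eps=1e-5):
    """Normalise x of shape (N, C, *spatial) per sample and per group (no affine)."""
    n, c = x.shape[:2]
    if c % num_groups != 0:
        raise ValueError("num_channels must be divisible by num_groups")
    # Fold each group's contiguous channels and all spatial positions into one axis.
    grouped = x.reshape(n, num_groups, -1)
    mean = grouped.mean(axis=-1, keepdims=True)
    var = grouped.var(axis=-1, keepdims=True)  # assumption: biased variance (ddof=0)
    normed = (grouped - mean) / np.sqrt(var + eps)
    return normed.reshape(x.shape)

rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(4, 32, 8, 8))
y = group_norm_reference(x, num_groups=8)

# View the output group by group to inspect its statistics.
per_group = y.reshape(4, 8, -1)
print(per_group.mean(axis=-1).shape, per_group.var(axis=-1).shape)
```

After normalisation, each of the 4 x 8 (sample, group) slices should have a mean close to zero and a variance close to one.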
Group Norm sits between two extremes: num_groups=1 recovers
Layer Norm (normalize over all channels at once), while
num_groups=num_channels recovers Instance Norm (each channel is
its own group). Unlike Batch Norm, Group Norm statistics are
independent of the batch size, making it stable for small batches
and well-suited to detection and segmentation models.
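The two extremes and the batch-size independence can be checked numerically. The sketch below uses plain NumPy rather than Lucid; the helper `group_norm` is a hypothetical stand-in without affine parameters, and it assumes the biased variance.

```python
import numpy as np

def group_norm(x, num_groups, eps=1e-5):
    """Hypothetical stand-in: per-sample, per-group normalisation without affine."""
    g = x.reshape(x.shape[0], num_groups, -1)
    g = (g - g.mean(axis=-1, keepdims=True)) / np.sqrt(g.var(axis=-1, keepdims=True) + eps)
    return g.reshape(x.shape)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 6, 5, 5))  # (N, C, H, W)
eps = 1e-5

# num_groups=1: one set of statistics per sample over (C, H, W), as in Layer Norm.
mu = x.mean(axis=(1, 2, 3), keepdims=True)
var = x.var(axis=(1, 2, 3), keepdims=True)
layer_like = (x - mu) / np.sqrt(var + eps)

# num_groups=C: one set of statistics per sample and channel over (H, W), as in Instance Norm.
mu = x.mean(axis=(2, 3), keepdims=True)
var = x.var(axis=(2, 3), keepdims=True)
instance_like = (x - mu) / np.sqrt(var + eps)

matches_layer = np.allclose(group_norm(x, 1), layer_like)
matches_instance = np.allclose(group_norm(x, 6), instance_like)

# Statistics never cross the batch axis, so a sample normalised alone
# matches the same sample normalised as part of a larger batch.
batch_independent = np.allclose(group_norm(x, 3)[:1], group_norm(x[:1], 3))
print(matches_layer, matches_instance, batch_independent)
```

All three checks are expected to hold, since the only difference between the variants is which axes are pooled when computing the mean and variance.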
Parameters
num_groups : int
    Number of groups. num_channels must be divisible by num_groups.
num_channels : int
    Number of input channels.
eps : float = 1e-05
    Default: 1e-5.
affine : bool = True
    If True, learns per-channel scale and shift of shape (num_channels,). Default: True.
device : DeviceLike = None
    Default: None.
dtype : DTypeLike = None
    Default: None.
Attributes
weight : Parameter or None
    Shape (num_channels,), initialised to ones. None when affine=False.
bias : Parameter or None
    Shape (num_channels,), initialised to zeros. None when affine=False.
Notes
- Input: $(N, C, *)$, where $*$ denotes zero or more spatial dimensions and $C$ = num_channels.
- Output: same shape as the input.
- num_channels must be divisible by num_groups; a ValueError is raised at the functional level if this is violated.
- Despite sharing a name with batch-norm affine parameters, the weight and bias here have shape (num_channels,) rather than being element-wise over the full normalized region.
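The two notes on divisibility and parameter shape can be illustrated with a short NumPy sketch. The helpers `check_groups` and `apply_channel_affine` are hypothetical names introduced for illustration only; they are not part of Lucid's API, and the error messages are assumptions.

```python
import numpy as np

def check_groups(num_channels, num_groups):
    """Return the group size, or raise if the channels do not split evenly."""
    if num_channels % num_groups != 0:
        raise ValueError("num_channels must be divisible by num_groups")
    return num_channels // num_groups

def apply_channel_affine(normed, weight, bias):
    """Scale and shift a normalised (N, C, *spatial) array with (C,) parameters."""
    c = normed.shape[1]
    if weight.shape != (c,) or bias.shape != (c,):
        raise ValueError("weight and bias must have shape (num_channels,)")
    # Reshape (C,) to (1, C, 1, ..., 1) so broadcasting lines up with the channel axis.
    shape = (1, c) + (1,) * (normed.ndim - 2)
    return normed * weight.reshape(shape) + bias.reshape(shape)

group_size = check_groups(32, 8)

try:
    check_groups(32, 5)
    raised = False
except ValueError:
    raised = True

normed = np.ones((2, 4, 3, 3))
weight = np.array([1.0, 2.0, 3.0, 4.0])
bias = np.zeros(4)
out = apply_channel_affine(normed, weight, bias)
print(group_size, raised, out[0, :, 0, 0])
```

Each channel gets a single scale and a single shift that are shared across all spatial positions, which is why the parameters need only num_channels entries each.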
Examples
32-channel input split into 8 groups:
>>> import lucid
>>> import lucid.nn as nn
>>> gn = nn.GroupNorm(num_groups=8, num_channels=32)
>>> x = lucid.randn(4, 32, 64, 64)
>>> out = gn(x)
>>> out.shape
(4, 32, 64, 64)
Layer-Norm equivalent (single group) on a 1-D sequence:
>>> gn_layer = nn.GroupNorm(num_groups=1, num_channels=128)
>>> x_seq = lucid.randn(16, 128, 200) # (N, C, L)
>>> out_seq = gn_layer(x_seq)
>>> out_seq.shape
(16, 128, 200)
Methods (3)
__init__
__init__(num_groups: int, num_channels: int, eps: float = 1e-05, affine: bool = True, device: DeviceLike = None, dtype: DTypeLike = None) → None
Initialise the GroupNorm module. See the class docstring for parameter semantics.
forward
forward(x: Tensor) → Tensor
Apply normalisation to the input tensor.
Parameters
x : Tensor
Returns
Tensor
    Normalised tensor of the same shape as x.
extra_repr
extra_repr() → str
Return a string representation of the layer's configuration.