nn.functional.group_norm¶
- lucid.nn.functional.group_norm(input_: Tensor, num_groups: int, weight: Tensor | None = None, bias: Tensor | None = None, eps: float = 1e-05) → Tensor¶
The group_norm function applies Group Normalization over a mini-batch of inputs. Unlike BatchNorm, which normalizes across the batch dimension, or InstanceNorm, which normalizes each channel independently, GroupNorm divides the channels into groups and computes the mean and variance within each group.
Function Signature¶
def group_norm(
    input_: Tensor,
    num_groups: int,
    weight: Tensor | None = None,
    bias: Tensor | None = None,
    eps: float = 1e-5,
) -> Tensor
Parameters¶
input_ (Tensor): Input tensor of shape \((N, C, *)\), where \(*\) can be any number of additional dimensions.
num_groups (int): Number of groups to divide the channels into. Must divide the number of channels evenly.
weight (Tensor, optional): Learnable affine scale tensor of shape \((C,)\). If provided, it is applied after normalization.
bias (Tensor, optional): Learnable affine bias tensor of shape \((C,)\). If provided, it is added after scaling.
eps (float, optional): A small value added to the denominator for numerical stability. Default is 1e-5.
Forward Computation¶
GroupNorm computes the mean and variance over each group of channels and normalizes accordingly:
\[\mu_g = \frac{1}{m} \sum_{i \in g} x_i, \qquad \sigma_g^2 = \frac{1}{m} \sum_{i \in g} (x_i - \mu_g)^2\]
\[y_i = \gamma_c \cdot \frac{x_i - \mu_g}{\sqrt{\sigma_g^2 + \epsilon}} + \beta_c\]
where \(m = \frac{C}{G} \times \text{prod}(\text{shape}[2:])\) is the number of elements per group and \(G\) is the number of groups. The per-channel affine parameters \(\gamma\) (weight) and \(\beta\) (bias) are applied only when provided.
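To make the reduction concrete, here is a minimal NumPy sketch of the same computation (an illustrative reference for the formula above, not lucid's actual implementation):

import numpy as np

N, C, H, W, G = 2, 6, 4, 4, 3
x = np.random.randn(N, C, H, W)

# Split the C channels into G groups of C // G channels each.
xg = x.reshape(N, G, C // G, H, W)

# Mean and variance over each group's m = (C / G) * H * W elements.
mean = xg.mean(axis=(2, 3, 4), keepdims=True)
var = xg.var(axis=(2, 3, 4), keepdims=True)

# Normalize and restore the original shape.
y = ((xg - mean) / np.sqrt(var + 1e-5)).reshape(N, C, H, W)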
Tip
If num_groups == C, this becomes Instance Normalization. If num_groups == 1, it becomes Layer Normalization over channels.
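Both special cases can be checked numerically. The following sketch uses a NumPy reference implementation with illustrative names (ref_group_norm is hypothetical, not part of lucid):

import numpy as np

def ref_group_norm(x, G, eps=1e-5):
    # Reshape-based GroupNorm reference (no affine parameters).
    N, C = x.shape[:2]
    xg = x.reshape(N, G, C // G, *x.shape[2:])
    axes = tuple(range(2, xg.ndim))
    mu = xg.mean(axis=axes, keepdims=True)
    var = xg.var(axis=axes, keepdims=True)
    return ((xg - mu) / np.sqrt(var + eps)).reshape(x.shape)

x = np.random.randn(2, 6, 4, 4)

# num_groups == C: per-(sample, channel) statistics, i.e. InstanceNorm.
inst = (x - x.mean(axis=(2, 3), keepdims=True)) / np.sqrt(x.var(axis=(2, 3), keepdims=True) + 1e-5)
assert np.allclose(ref_group_norm(x, G=6), inst)

# num_groups == 1: per-sample statistics over all channels, i.e. LayerNorm over channels.
layer = (x - x.mean(axis=(1, 2, 3), keepdims=True)) / np.sqrt(x.var(axis=(1, 2, 3), keepdims=True) + 1e-5)
assert np.allclose(ref_group_norm(x, G=1), layer)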
Returns¶
- Tensor:
The normalized tensor with the same shape as the input.
Examples¶
>>> import lucid.nn.functional as F
>>> from lucid import Tensor
>>> x = Tensor.randn(2, 6, 4, 4, requires_grad=True)
>>> out = F.group_norm(x, num_groups=3)
>>> print(out.shape)
(2, 6, 4, 4)
>>> # With affine parameters
>>> weight = Tensor.ones(6, requires_grad=True)
>>> bias = Tensor.zeros(6, requires_grad=True)
>>> out = F.group_norm(x, num_groups=3, weight=weight, bias=bias)
>>> out.backward()
Warning
The number of channels (C) must be divisible by num_groups. Otherwise, this function will raise a ValueError.
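For instance, 6 channels cannot be split into 4 equal groups (a sketch of the expected failure; the exact error message may differ):

>>> x = Tensor.randn(2, 6, 4, 4)
>>> try:
...     F.group_norm(x, num_groups=4)  # 6 is not divisible by 4
... except ValueError:
...     print("num_groups must divide C")
num_groups must divide C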