nn.InstanceNorm2d¶
- class lucid.nn.InstanceNorm2d(num_features: int, eps: float = 1e-05, momentum: float | None = 0.1, affine: bool = True, track_running_stats: bool = True)¶
The InstanceNorm2d module applies Instance Normalization over a four-dimensional input (a mini-batch of 2D inputs with an additional channel dimension, i.e. tensors of shape \((N, C, H, W)\)).
Instance Normalization normalizes each feature for every instance and channel separately, using the mean and variance computed across the spatial dimensions. This technique is particularly useful in tasks like image generation and style transfer, where per-instance normalization reduces instance-specific contrast variations and can lead to better performance.
Class Signature¶
class lucid.nn.InstanceNorm2d(
num_features: int,
eps: float = 1e-5,
momentum: float | None = 0.1,
affine: bool = True,
track_running_stats: bool = True,
) -> None
Parameters¶
- num_features (int):
The number of features or channels in the input tensor, i.e. \(C\) for an expected input of shape \((N, C, H, W)\).
- eps (float, optional):
A small value added to the denominator for numerical stability. Default is 1e-5.
- momentum (float or None, optional):
The value used for the running mean and running variance computation. When set to None, a cumulative (simple) moving average is used instead of an exponential one. Default is 0.1.
- affine (bool, optional):
If True, the module has learnable affine parameters (scale and shift). Default is True.
- track_running_stats (bool, optional):
If True, the module tracks the running mean and variance, which are not trainable but are updated during training and used during evaluation. Default is True.
Attributes¶
weight (Tensor or None): The learnable scale parameter \(\gamma\) of shape (num_features). Only present if affine is True.
bias (Tensor or None): The learnable shift parameter \(\beta\) of shape (num_features). Only present if affine is True.
running_mean (Tensor): The running mean of shape (num_features). Updated during training if track_running_stats is True.
running_var (Tensor): The running variance of shape (num_features). Updated during training if track_running_stats is True.
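As a brief illustration of these options, the minimal sketch below constructs the module with and without affine parameters and inspects the documented attributes (only the None case is printed, since the exact textual representation of the parameter tensors depends on the implementation):

>>> import lucid.nn as nn
>>> norm = nn.InstanceNorm2d(num_features=2)   # learnable gamma/beta and running stats by default
>>> norm.weight is None                        # gamma of shape (2,) is present
False
>>> plain = nn.InstanceNorm2d(num_features=2, affine=False)
>>> print(plain.weight, plain.bias)            # no learnable scale or shift
None None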
Forward Calculation¶
The InstanceNorm2d module normalizes the input tensor and optionally applies a scale and shift transformation. The normalization is performed as follows:

\[
\mathbf{y} = \frac{\mathbf{x} - \mu}{\sqrt{\sigma^2 + \epsilon}} \cdot \gamma + \beta
\]
Where:
\(\mathbf{x}\) is the input tensor of shape \((N, C, H, W)\), where \(N\) is the batch size, \(C\) is the number of channels/features, and \(H\) and \(W\) are the height and width of the input.
\(\mu\) is the mean of the input over the spatial dimensions \(H\) and \(W\) for each instance and channel.
\(\sigma^2\) is the variance of the input over the spatial dimensions \(H\) and \(W\) for each instance and channel.
\(\epsilon\) is a small constant for numerical stability.
\(\gamma\) and \(\beta\) are the learnable scale and shift parameters, respectively.
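The statistics above can be reproduced directly with NumPy. The sketch below is independent of lucid and only illustrates the formula (assuming \(\gamma = 1\) and \(\beta = 0\)); it uses the same values as the example at the end of this page:

>>> import numpy as np
>>> x = np.arange(1.0, 13.0).reshape(1, 2, 2, 3)   # (N, C, H, W)
>>> mu = x.mean(axis=(2, 3), keepdims=True)        # mean over H and W, per instance and channel
>>> var = x.var(axis=(2, 3), keepdims=True)        # biased variance over H and W
>>> y = (x - mu) / np.sqrt(var + 1e-5)             # gamma = 1, beta = 0
>>> y[0, 0].round(4)
array([[-1.4638, -0.8783, -0.2928],
       [ 0.2928,  0.8783,  1.4638]])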
Backward Gradient Calculation¶
During backpropagation, gradients are computed with respect to the input, scale (\(\gamma\)), and shift (\(\beta\)) parameters.
The gradient calculations are as follows:
Gradient with respect to \(\mathbf{x}\):

\[
\frac{\partial \mathcal{L}}{\partial \mathbf{x}} =
\frac{\gamma}{\sqrt{\sigma^2 + \epsilon}}
\left(
\frac{\partial \mathcal{L}}{\partial \mathbf{y}}
- \frac{1}{m} \sum_{h,w} \frac{\partial \mathcal{L}}{\partial \mathbf{y}}
- \frac{\hat{\mathbf{x}}}{m} \sum_{h,w} \frac{\partial \mathcal{L}}{\partial \mathbf{y}} \cdot \hat{\mathbf{x}}
\right)
\]

Gradient with respect to \(\gamma\):

\[
\frac{\partial \mathcal{L}}{\partial \gamma} = \sum_{n,h,w} \frac{\partial \mathcal{L}}{\partial \mathbf{y}} \cdot \hat{\mathbf{x}}
\]

Gradient with respect to \(\beta\):

\[
\frac{\partial \mathcal{L}}{\partial \beta} = \sum_{n,h,w} \frac{\partial \mathcal{L}}{\partial \mathbf{y}}
\]
Where:
\(\mathcal{L}\) is the loss function.
\(\hat{\mathbf{x}} = (\mathbf{x} - \mu) / \sqrt{\sigma^2 + \epsilon}\) is the normalized input.
\(m = H \cdot W\) is the number of spatial positions per instance and channel.
\(\mu\) and \(\sigma^2\) are the mean and variance for each instance and channel, as in the forward pass.
These gradients ensure that the normalization process adjusts the parameters to minimize the loss function effectively.
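As a sanity check, the NumPy sketch below evaluates these three expressions for a random input and an arbitrary upstream gradient \(\partial \mathcal{L} / \partial \mathbf{y}\); it only illustrates the formulas and is not lucid's internal implementation:

>>> import numpy as np
>>> rng = np.random.default_rng(0)
>>> x = rng.standard_normal((1, 2, 4, 4))            # (N, C, H, W)
>>> g = rng.standard_normal(x.shape)                 # upstream gradient dL/dy
>>> gamma, eps = 1.0, 1e-5
>>> mu = x.mean(axis=(2, 3), keepdims=True)
>>> var = x.var(axis=(2, 3), keepdims=True)
>>> x_hat = (x - mu) / np.sqrt(var + eps)            # normalized input
>>> dgamma = (g * x_hat).sum(axis=(0, 2, 3))         # dL/dgamma, one value per channel
>>> dbeta = g.sum(axis=(0, 2, 3))                    # dL/dbeta, one value per channel
>>> dx = gamma / np.sqrt(var + eps) * (
...     g
...     - g.mean(axis=(2, 3), keepdims=True)
...     - x_hat * (g * x_hat).mean(axis=(2, 3), keepdims=True)
... )                                                # dL/dx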
Examples¶
Using `InstanceNorm2d` with a simple input tensor:
>>> import lucid.nn as nn
>>> from lucid import Tensor
>>> input_tensor = Tensor([[
... [[1.0, 2.0, 3.0],
... [4.0, 5.0, 6.0]],
... [[7.0, 8.0, 9.0],
... [10.0, 11.0, 12.0]]
... ]], requires_grad=True) # Shape: (1, 2, 2, 3)
>>> instance_norm = nn.InstanceNorm2d(num_features=2)
>>> output = instance_norm(input_tensor) # Shape: (1, 2, 2, 3)
>>> print(output)
Tensor([[
[[-1.4638, -0.8783, -0.2928],
[0.2928, 0.8783, 1.4638]],
[[-1.4638, -0.8783, -0.2928],
[0.2928, 0.8783, 1.4638]]
]], grad=None)
# Backpropagation
>>> output.backward()
>>> print(input_tensor.grad)
[[
[[0.0, 0.0, 0.0],
[0.0, 0.0, 0.0]],
[[0.0, 0.0, 0.0],
[0.0, 0.0, 0.0]]
]] # Gradients with respect to input_tensor; zero because the upstream gradient is constant