util.ROIAlign

class lucid.models.objdet.util.ROIAlign(output_size: tuple[int, int])

The ROIAlign module performs Region of Interest (RoI) Align, which extracts fixed-size feature maps from input feature tensors based on given bounding box regions. Unlike RoI Pooling, RoI Align avoids quantization artifacts using bilinear interpolation.

Constructor

def __init__(self, output_size: tuple[int, int]) -> None

Parameters

  • output_size (tuple[int, int]): The target spatial size \((H_{\text{out}}, W_{\text{out}})\) for each cropped region.

Returns

  • ROIAlign (nn.Module): A module that, when called, takes a feature tensor and region boxes and returns aligned feature crops.

Input & Output

def forward(
    features: Tensor,
    rois: Tensor,
    spatial_scale: float = 1.0,
    sampling_ratio: int = -1
) -> Tensor
  • features (Tensor): Input feature map of shape \((N, C, H, W)\).

  • rois (Tensor): Boxes of shape \((K, 5)\) where each row is \((batch_idx, x_1, y_1, x_2, y_2)\).

  • spatial_scale (float, optional): Scale factor applied to RoI coordinates to match the input feature map size. Defaults to 1.0.

  • sampling_ratio (int, optional): Number of sampling points in the interpolation grid. -1 means automatic. Defaults to -1.

Returns:

  • Tensor: A tensor of shape \((K, C, H_{\text{out}}, W_{\text{out}})\) containing the aligned region features.

Note

  • Gradient is preserved through bilinear interpolation.

  • Batch index must be included in the RoI tensor.

Example

>>> from lucid.models.objdet.util import ROIAlign
>>> roi_align = ROIAlign(output_size=(7, 7))
>>> features = lucid.random.randn(1, 256, 32, 32)
>>> rois = lucid.Tensor([[0, 4, 4, 24, 24]])
>>> crops = roi_align(features, rois)
>>> print(crops.shape)
(1, 256, 7, 7)