util.ROIAlign
class lucid.models.objdet.util.ROIAlign(output_size: tuple[int, int])
The ROIAlign module performs Region of Interest (RoI) Align, which extracts fixed-size feature maps from an input feature tensor for a set of bounding boxes. Unlike RoI Pooling, RoI Align does not round box coordinates to integer cells; it samples the feature map at fractional locations with bilinear interpolation, which avoids the misalignment that quantization introduces.
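To make the difference from quantization concrete, here is a minimal NumPy sketch of bilinear sampling at a fractional location. It is illustrative only and is not taken from the lucid source; the helper name `bilinear_sample` is hypothetical.

```python
import numpy as np

def bilinear_sample(fmap: np.ndarray, y: float, x: float) -> float:
    """Sample a 2-D map at a fractional (y, x) location (hypothetical helper)."""
    h, w = fmap.shape
    # Clamp so the four neighbouring cells always exist.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    ly, lx = y - y0, x - x0
    # Interpolate along x on the two rows, then along y between them.
    top = (1 - lx) * fmap[y0, x0] + lx * fmap[y0, x1]
    bottom = (1 - lx) * fmap[y1, x0] + lx * fmap[y1, x1]
    return float((1 - ly) * top + ly * bottom)

fmap = np.arange(16, dtype=np.float64).reshape(4, 4)  # fmap[y, x] = 4 * y + x
exact = bilinear_sample(fmap, 1.5, 2.25)              # 8.25
# Quantizing the coordinates first snaps to a single integer cell.
quantized = float(fmap[int(np.floor(1.5)), int(np.floor(2.25))])  # 6.0
print(exact, quantized)
```

The interpolated value reflects the fractional position, while the quantized lookup discards it.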
Constructor
def __init__(self, output_size: tuple[int, int]) -> None
Parameters
output_size (tuple[int, int]): The target spatial size \((H_{\text{out}}, W_{\text{out}})\) for each cropped region.
Returns
ROIAlign (nn.Module): A module that, when called, takes a feature tensor and region boxes and returns aligned feature crops.
Input & Output
def forward(
features: Tensor,
rois: Tensor,
spatial_scale: float = 1.0,
sampling_ratio: int = -1
) -> Tensor
features (Tensor): Input feature map of shape \((N, C, H, W)\).
rois (Tensor): Boxes of shape \((K, 5)\) where each row is \((\text{batch\_idx}, x_1, y_1, x_2, y_2)\).
spatial_scale (float, optional): Factor by which the RoI coordinates are multiplied to map them into feature-map coordinates (for example, 1/16 when the feature map has a stride of 16 relative to the image). Defaults to 1.0.
sampling_ratio (int, optional): Number of sampling points in the interpolation grid used to compute each output bin. A value of -1 lets the module choose the number automatically. Defaults to -1.
Returns:
Tensor: A tensor of shape \((K, C, H_{\text{out}}, W_{\text{out}})\) containing the aligned region features.
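The following is a rough NumPy reference of how the arguments above interact, under stated assumptions rather than as a description of lucid's actual implementation. It assumes that a non-positive `sampling_ratio` falls back to `ceil(bin size)` samples per axis, that no half-pixel offset is applied, and that RoIs are at least one cell wide and tall. The names `bilinear` and `roi_align_reference` are hypothetical.

```python
import math
import numpy as np

def bilinear(fmap, y, x):
    """Bilinearly sample a (C, H, W) map at fractional (y, x); returns shape (C,)."""
    _, h, w = fmap.shape
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    ly, lx = y - y0, x - x0
    top = (1 - lx) * fmap[:, y0, x0] + lx * fmap[:, y0, x1]
    bottom = (1 - lx) * fmap[:, y1, x0] + lx * fmap[:, y1, x1]
    return (1 - ly) * top + ly * bottom

def roi_align_reference(features, rois, output_size,
                        spatial_scale=1.0, sampling_ratio=-1):
    """Loop-based sketch: (N, C, H, W) features + (K, 5) rois -> (K, C, H_out, W_out)."""
    _, c, _, _ = features.shape
    out_h, out_w = output_size
    out = np.zeros((len(rois), c, out_h, out_w), dtype=features.dtype)
    for k, (batch_idx, x1, y1, x2, y2) in enumerate(rois):
        fmap = features[int(batch_idx)]
        # Map box corners into feature-map coordinates.
        x1, y1, x2, y2 = (v * spatial_scale for v in (x1, y1, x2, y2))
        roi_w, roi_h = max(x2 - x1, 1.0), max(y2 - y1, 1.0)
        bin_h, bin_w = roi_h / out_h, roi_w / out_w
        # Assumed convention: non-positive sampling_ratio -> ceil(bin size) per axis.
        sy = sampling_ratio if sampling_ratio > 0 else math.ceil(bin_h)
        sx = sampling_ratio if sampling_ratio > 0 else math.ceil(bin_w)
        for i in range(out_h):
            for j in range(out_w):
                acc = np.zeros(c, dtype=np.float64)
                # Average evenly spaced samples inside output bin (i, j).
                for iy in range(sy):
                    yy = y1 + (i + (iy + 0.5) / sy) * bin_h
                    for ix in range(sx):
                        xx = x1 + (j + (ix + 0.5) / sx) * bin_w
                        acc += bilinear(fmap, yy, xx)
                out[k, :, i, j] = acc / (sy * sx)
    return out

features = np.random.default_rng(0).standard_normal((1, 8, 32, 32))
rois = np.array([[0, 4, 4, 24, 24]], dtype=np.float64)
crops = roi_align_reference(features, rois, output_size=(7, 7))
print(crops.shape)  # (1, 8, 7, 7)
```

The explicit loops favour readability over speed; a practical implementation would vectorize the sampling.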
Note
Gradients flow through the bilinear interpolation.
Each row of the RoI tensor must include the batch index as its first element.
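When boxes are stored per image, they need a leading batch index before they match the \((K, 5)\) layout. The NumPy sketch below shows one way to build that array; the helper name `boxes_to_rois` is hypothetical, and converting the result to a lucid tensor is left out.

```python
import numpy as np

def boxes_to_rois(boxes_per_image):
    """Stack per-image (K_i, 4) boxes into one (K, 5) array with a leading batch index."""
    rows = []
    for batch_idx, boxes in enumerate(boxes_per_image):
        boxes = np.asarray(boxes, dtype=np.float64).reshape(-1, 4)
        # One batch-index column entry per box of this image.
        idx = np.full((len(boxes), 1), batch_idx, dtype=np.float64)
        rows.append(np.hstack([idx, boxes]))
    return np.concatenate(rows, axis=0) if rows else np.zeros((0, 5))

rois = boxes_to_rois([
    [[4, 4, 24, 24], [0, 0, 16, 16]],  # two boxes for image 0
    [[8, 8, 30, 30]],                  # one box for image 1
])
print(rois.shape)  # (3, 5)
```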
Example
>>> import lucid
>>> from lucid.models.objdet.util import ROIAlign
>>> roi_align = ROIAlign(output_size=(7, 7))
>>> features = lucid.random.randn(1, 256, 32, 32)
>>> rois = lucid.Tensor([[0, 4, 4, 24, 24]])
>>> crops = roi_align(features, rois)
>>> print(crops.shape)
(1, 256, 7, 7)