ResNeXt

ConvNet Image Classification

class lucid.models.ResNeXt(block: Module, layers: list[int], cardinality: int, base_width: int, num_classes: int = 1000)

The ResNeXt class extends the ResNet architecture with grouped convolutions, which raise model capacity at a comparable computational cost. The number of groups in each grouped convolution is set by cardinality, a hyperparameter of the model.

Class Signature

class lucid.models.ResNeXt(
    block: nn.Module,
    layers: list[int],
    cardinality: int,
    base_width: int,
    num_classes: int = 1000,
)

Parameters

block (nn.Module): The building block module used for the ResNeXt layers. Typically, this is a bottleneck block.
layers (list[int]): Specifies the number of blocks in each stage of the network.
cardinality (int): Number of groups for grouped convolutions. At a comparable computational cost, a higher cardinality (paired with a narrower width per group) increases model capacity.
base_width (int): The base width of feature channels in each group.
num_classes (int, optional): Number of output classes for the final fully connected layer. Default: 1000.
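As an illustration of how cardinality and base_width interact, the sketch below computes the channel count of the grouped 3x3 convolution in each stage. It assumes the commonly used convention in which base_width is the per-group width at the first stage (64 planes) and scales with the stage; the helper names grouped_width and stage_widths are hypothetical, and Lucid's internal formula may differ.

```python
def grouped_width(planes: int, cardinality: int, base_width: int) -> int:
    """Channels of the grouped 3x3 convolution in one bottleneck block.

    Assumption: base_width is the per-group width when planes == 64 and
    grows proportionally with planes (a common ResNeXt convention).
    """
    return (planes * base_width // 64) * cardinality


def stage_widths(cardinality: int, base_width: int) -> list[int]:
    # Assumed stage plan of a standard ResNet: 64, 128, 256, 512 planes.
    return [grouped_width(p, cardinality, base_width) for p in (64, 128, 256, 512)]


# Compare a plain ResNet-style setting with two ResNeXt settings.
for name, (c, d) in {"1x64d": (1, 64), "32x4d": (32, 4), "32x8d": (32, 8)}.items():
    print(name, stage_widths(c, d))
```

Under these assumptions, a 32x4d configuration uses 128 channels (32 groups of 4) in the first stage, twice the 64 channels of the equivalent ResNet bottleneck.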

Attributes

layers (list[nn.Module]): A list of stages, each containing a sequence of grouped convolutional blocks.
cardinality (int): Stores the number of groups used in the grouped convolutions.
base_width (int): Stores the base width of the feature maps for each group.

Forward Calculation

The forward pass of the ResNeXt model includes:

Stem: Initial convolutional layers for feature extraction.
Grouped Convolution Stages: Each stage applies grouped convolutions based on the cardinality parameter.
Global Pooling: A global average pooling layer reduces spatial dimensions.
Classifier: A fully connected layer maps the features to class scores.

\[\text{output} = \text{FC}(\text{GAP}(\text{GroupedConvBlocks}(\text{Stem}(\text{input}))))\]
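The composition above can be followed as a shape trace. This is a minimal sketch that assumes a standard ResNet-style layout (7x7 stride-2 stem convolution, 3x3 stride-2 max pooling, four stages with a bottleneck expansion of 4, stride 2 from the second stage on); none of these details are stated in this page, and the function names are hypothetical. It does not call Lucid.

```python
def conv_out(size: int, kernel: int, stride: int, padding: int) -> int:
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_shapes(input_size: int = 224, num_classes: int = 1000):
    """Return (name, shape) pairs for a single image, batch axis omitted."""
    shapes = []

    # Stem (assumed): 7x7 conv, stride 2, then 3x3 max pool, stride 2.
    s = conv_out(input_size, kernel=7, stride=2, padding=3)
    shapes.append(("stem_conv", (64, s, s)))
    s = conv_out(s, kernel=3, stride=2, padding=1)
    shapes.append(("stem_pool", (64, s, s)))

    # Grouped convolution stages (assumed): expansion 4, stride 2 after stage 1.
    channels = 64
    plan = [(64, 1), (128, 2), (256, 2), (512, 2)]
    for i, (planes, stride) in enumerate(plan, start=1):
        s = conv_out(s, kernel=3, stride=stride, padding=1)
        channels = planes * 4
        shapes.append((f"stage{i}", (channels, s, s)))

    # Global average pooling removes the spatial axes; FC maps to class scores.
    shapes.append(("gap", (channels,)))
    shapes.append(("fc", (num_classes,)))
    return shapes


for name, shape in trace_shapes():
    print(f"{name:10s} {shape}")
```

The block counts in layers change the depth of each stage but not these shapes, which is why the trace does not take layers as an argument.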

Note

The ResNeXt architecture introduces cardinality as an additional dimension, alongside depth and width, for controlling model capacity.
Increasing the cardinality while narrowing each group keeps the computational cost roughly constant and can improve feature learning.
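To make the cost argument concrete, the following sketch counts the convolution weights of a single bottleneck block (1x1 reduce, 3x3 grouped, 1x1 expand) with 256 input channels, ignoring biases and normalization layers. The function bottleneck_weights is a hypothetical helper written for this comparison, not part of Lucid.

```python
def bottleneck_weights(in_channels: int, width: int, groups: int) -> int:
    """Convolution weights in a 1x1 -> grouped 3x3 -> 1x1 bottleneck block."""
    reduce = in_channels * width                   # 1x1 conv: in_channels -> width
    grouped = (width // groups) * width * 3 * 3    # 3x3 conv split into `groups` groups
    expand = width * in_channels                   # 1x1 conv: width -> in_channels
    return reduce + grouped + expand


# ResNet-style block: one group, 64 channels in the 3x3 convolution.
resnet = bottleneck_weights(256, width=64, groups=1)

# ResNeXt-style block: 32 groups of 4 channels, so 128 channels in total.
resnext = bottleneck_weights(256, width=128, groups=32)

print(resnet, resnext)
```

The two counts come out within about one percent of each other (69,632 versus 70,144), even though the grouped block carries twice as many channels through its 3x3 convolution.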