AlexNet
ConvNet
- class lucid.models.AlexNet(config: AlexNetConfig)
The AlexNet module in lucid.models implements the AlexNet architecture, a convolutional neural network designed for image classification tasks. It consists of multiple convolutional and fully connected layers with ReLU activations and dropout for regularization, and it is configured through AlexNetConfig.
%%{init: {"flowchart":{"curve":"monotoneX","nodeSpacing":50,"rankSpacing":50},"themeCSS":".nodeLabel, .edgeLabel, .cluster text, .node text { fill: #000000 !important; } .node foreignObject *, .cluster foreignObject * { color: #000000 !important; }"} }%%
flowchart LR
linkStyle default stroke-width:2.0px
subgraph sg_m0["<span style='font-size:20px;font-weight:700'>alexnet</span>"]
style sg_m0 fill:#000000,fill-opacity:0.05,stroke:#000000,stroke-opacity:0.75,stroke-width:1px
subgraph sg_m1["conv"]
style sg_m1 fill:#000000,fill-opacity:0.05,stroke:#000000,stroke-opacity:0.75,stroke-width:1px
m2["Conv2d<br/><span style='font-size:11px;font-weight:400'>(1,3,224,224) → (1,64,55,55)</span>"];
m3["ReLU"];
m4["MaxPool2d<br/><span style='font-size:11px;font-weight:400'>(1,64,55,55) → (1,64,27,27)</span>"];
m5["Conv2d<br/><span style='font-size:11px;font-weight:400'>(1,64,27,27) → (1,192,27,27)</span>"];
m6["ReLU"];
m7["MaxPool2d<br/><span style='font-size:11px;font-weight:400'>(1,192,27,27) → (1,192,13,13)</span>"];
m8["Conv2d<br/><span style='font-size:11px;font-weight:400'>(1,192,13,13) → (1,384,13,13)</span>"];
m9["ReLU"];
m10["Conv2d<br/><span style='font-size:11px;font-weight:400'>(1,384,13,13) → (1,256,13,13)</span>"];
m11["ReLU"];
m12["Conv2d"];
m13["ReLU"];
m14["MaxPool2d<br/><span style='font-size:11px;font-weight:400'>(1,256,13,13) → (1,256,6,6)</span>"];
end
m15["AdaptiveAvgPool2d"];
subgraph sg_m16["fc"]
style sg_m16 fill:#000000,fill-opacity:0.05,stroke:#000000,stroke-opacity:0.75,stroke-width:1px
m17["Dropout"];
m18["Linear<br/><span style='font-size:11px;font-weight:400'>(1,9216) → (1,4096)</span>"];
m19["ReLU"];
m20["Dropout"];
m21["Linear"];
m22["ReLU"];
m23["Linear<br/><span style='font-size:11px;font-weight:400'>(1,4096) → (1,1000)</span>"];
end
end
input["Input<br/><span style='font-size:11px;color:#000000;font-weight:400'>(1,3,224,224)</span>"];
output["Output<br/><span style='font-size:11px;color:#000000;font-weight:400'>(1,1000)</span>"];
style input fill:#fff3cd,stroke:#a67c00,stroke-width:1px;
style output fill:#fff3cd,stroke:#a67c00,stroke-width:1px;
style m2 fill:#ffe8e8,stroke:#c53030,stroke-width:1px;
style m3 fill:#faf5ff,stroke:#6b46c1,stroke-width:1px;
style m4 fill:#fefcbf,stroke:#b7791f,stroke-width:1px;
style m5 fill:#ffe8e8,stroke:#c53030,stroke-width:1px;
style m6 fill:#faf5ff,stroke:#6b46c1,stroke-width:1px;
style m7 fill:#fefcbf,stroke:#b7791f,stroke-width:1px;
style m8 fill:#ffe8e8,stroke:#c53030,stroke-width:1px;
style m9 fill:#faf5ff,stroke:#6b46c1,stroke-width:1px;
style m10 fill:#ffe8e8,stroke:#c53030,stroke-width:1px;
style m11 fill:#faf5ff,stroke:#6b46c1,stroke-width:1px;
style m12 fill:#ffe8e8,stroke:#c53030,stroke-width:1px;
style m13 fill:#faf5ff,stroke:#6b46c1,stroke-width:1px;
style m14 fill:#fefcbf,stroke:#b7791f,stroke-width:1px;
style m15 fill:#fefcbf,stroke:#b7791f,stroke-width:1px;
style m17 fill:#edf2f7,stroke:#4a5568,stroke-width:1px;
style m18 fill:#ebf8ff,stroke:#2b6cb0,stroke-width:1px;
style m19 fill:#faf5ff,stroke:#6b46c1,stroke-width:1px;
style m20 fill:#edf2f7,stroke:#4a5568,stroke-width:1px;
style m21 fill:#ebf8ff,stroke:#2b6cb0,stroke-width:1px;
style m22 fill:#faf5ff,stroke:#6b46c1,stroke-width:1px;
style m23 fill:#ebf8ff,stroke:#2b6cb0,stroke-width:1px;
input --> m2;
m10 --> m11;
m11 --> m12;
m12 --> m13;
m13 --> m14;
m14 --> m15;
m15 --> m17;
m17 --> m18;
m18 --> m19;
m19 --> m20;
m2 --> m3;
m20 --> m21;
m21 --> m22;
m22 --> m23;
m23 --> output;
m3 --> m4;
m4 --> m5;
m5 --> m6;
m6 --> m7;
m7 --> m8;
m8 --> m9;
m9 --> m10;
Class Signature
class AlexNet(nn.Module):
    def __init__(self, config: AlexNetConfig)
Parameters
config (AlexNetConfig): A configuration object describing the output class count, input channels, dropout rate, and classifier hidden dimensions.
Attributes
config (AlexNetConfig): The configuration used to build the model.
conv (nn.Sequential): The convolutional layers, including pooling and ReLU activations.
avgpool (nn.AdaptiveAvgPool2d): Adaptive average pooling layer that reduces the spatial dimensions to (6, 6).
fc (nn.Sequential): The fully connected layers with dropout and ReLU activations for classification.
Architecture
The architecture of AlexNet is as follows:
- Convolutional Layers:
  - 5 convolutional layers with ReLU activations.
  - Max pooling after the 1st, 2nd, and 5th convolutional layers.
- Fully Connected Layers:
  - 2 hidden fully connected layers with ReLU activations, 4096 units each in the default configuration shown in the diagram (widths are set through classifier_hidden_features).
  - Output layer with num_classes units for classification.
- Regularization:
  - Dropout is applied to the fully connected layers to reduce overfitting.
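The shapes in the diagram can be reproduced with the standard convolution output-size formula, floor((size + 2 * padding - kernel) / stride) + 1. The sketch below is plain Python and does not call lucid. The kernel, stride, and padding values are assumptions borrowed from the common single-stream AlexNet layout, chosen because they reproduce the diagram's shapes; they are not read from the lucid source.

```python
def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output length along one spatial axis for a conv or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


# (name, kernel, stride, padding, out_channels); None means pooling keeps channels.
# ASSUMPTION: these hyperparameters are illustrative, not taken from lucid.
LAYERS = [
    ("conv1", 11, 4, 2, 64),
    ("pool1", 3, 2, 0, None),
    ("conv2", 5, 1, 2, 192),
    ("pool2", 3, 2, 0, None),
    ("conv3", 3, 1, 1, 384),
    ("conv4", 3, 1, 1, 256),
    ("conv5", 3, 1, 1, 256),
    ("pool5", 3, 2, 0, None),
]


def trace_shapes(size: int = 224, channels: int = 3):
    """Return (name, channels, height, width) after each layer for a square input."""
    shapes = []
    for name, kernel, stride, padding, out_channels in LAYERS:
        size = conv_out(size, kernel, stride, padding)
        if out_channels is not None:
            channels = out_channels
        shapes.append((name, channels, size, size))
    return shapes


shapes = trace_shapes()
for name, c, h, w in shapes:
    print(f"{name}: ({c}, {h}, {w})")

# Number of features entering the first Linear layer after flattening.
_, c, h, w = shapes[-1]
flat_features = c * h * w
print(flat_features)
```

Under these assumptions the final feature map is (256, 6, 6), which flattens to the 9216 inputs shown for the first Linear layer in the diagram.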
Examples
Basic Example
import lucid
import lucid.models as models

# Build AlexNet with the default configuration
config = models.AlexNetConfig()
model = models.AlexNet(config)

# Input tensor with shape (1, 3, 224, 224)
input_ = lucid.random.randn(1, 3, 224, 224)

# Perform forward pass
output = model(input_)
print(output.shape)  # Shape: (1, 1000)
Explanation
The model processes the input through its convolutional and fully connected layers, producing logits for 1000 classes.
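Because the output is a vector of logits rather than probabilities, a softmax is usually applied before reading off class scores. The following is a minimal sketch in plain Python that makes no assumptions about lucid's API; the logit values and the helper names `softmax` and `top_k` are illustrative only.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities, subtracting the max for numerical stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probs, k=3):
    """Return the k (class_index, probability) pairs with the highest probability."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Illustrative logits for a 4-class problem (not real model output).
logits = [2.0, 0.5, -1.0, 3.0]
probs = softmax(logits)
best = top_k(probs, k=2)
print(best)
```

The same procedure applies to the 1000-element output of the default model: the index with the largest probability is the predicted class.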
Custom Configuration
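import lucid
import lucid.models as models

# 10 output classes, single-channel input, lighter dropout, smaller classifier
config = models.AlexNetConfig(
    num_classes=10,
    in_channels=1,
    dropout=0.25,
    classifier_hidden_features=(512, 256),
)
model = models.AlexNet(config)

# Single-channel input with shape (1, 1, 224, 224)
input_ = lucid.random.randn(1, 1, 224, 224)
output = model(input_)
print(output.shape)  # Shape: (1, 10)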
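To see how classifier_hidden_features changes the size of the classifier, the sketch below lists the (in_features, out_features) pair of each Linear layer and counts their parameters. It is plain Python and does not inspect a lucid model. It assumes that each entry of the tuple maps to one Linear layer with a bias term, and that the flattened input stays at 256 * 6 * 6 = 9216 features because of the (6, 6) adaptive pooling. The helper names are hypothetical.

```python
def classifier_dims(hidden_features, num_classes, flat_features=256 * 6 * 6):
    """Return (in_features, out_features) for each Linear layer in the classifier."""
    widths = [flat_features, *hidden_features, num_classes]
    return list(zip(widths[:-1], widths[1:]))


def linear_param_count(dims):
    """Count weights plus biases, ASSUMING every Linear layer has a bias."""
    return sum(n_in * n_out + n_out for n_in, n_out in dims)


custom = classifier_dims((512, 256), 10)
default = classifier_dims((4096, 4096), 1000)

print(custom)
print(linear_param_count(custom))
print(linear_param_count(default))
```

Under these assumptions the custom classifier has roughly 4.9 million parameters, compared with about 58.6 million for the default widths, so most of the savings come from shrinking the first Linear layer.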