ConvNeXt_V2
ConvNet
- class lucid.models.ConvNeXt_V2(config: ConvNeXtV2Config)
The ConvNeXt_V2 class in lucid.models builds on the original ConvNeXt architecture, which it subclasses. It provides updated configurations for image classification while keeping the hierarchical four-stage design of its predecessor. The model structure is defined through a ConvNeXtV2Config object.
%%{init: {"flowchart":{"curve":"monotoneX","nodeSpacing":50,"rankSpacing":50}} }%%
flowchart LR
linkStyle default stroke-width:2.0px
subgraph sg_m0["<span style='font-size:20px;font-weight:700'>convnext_v2_base</span>"]
style sg_m0 fill:#000000,fill-opacity:0.05,stroke:#000000,stroke-opacity:0.75,stroke-width:1px
subgraph sg_m1["downsample_layers"]
style sg_m1 fill:#000000,fill-opacity:0.05,stroke:#000000,stroke-opacity:0.75,stroke-width:1px
subgraph sg_m2["Sequential"]
style sg_m2 fill:#000000,fill-opacity:0.05,stroke:#000000,stroke-opacity:0.75,stroke-width:1px
m3["Conv2d<br/><span style='font-size:11px;color:#c53030;font-weight:400'>(1,3,224,224) → (1,128,56,56)</span>"];
m4["_ChannelsFisrtLayerNorm"];
end
subgraph sg_m5["Sequential x 3"]
style sg_m5 fill:#000000,fill-opacity:0.05,stroke:#000000,stroke-opacity:0.75,stroke-width:1px
m5_in(["Input"]);
m5_out(["Output"]);
style m5_in fill:#e2e8f0,stroke:#64748b,stroke-width:1px;
style m5_out fill:#e2e8f0,stroke:#64748b,stroke-width:1px;
m6["_ChannelsFisrtLayerNorm"];
m7["Conv2d<br/><span style='font-size:11px;color:#c53030;font-weight:400'>(1,128,56,56) → (1,256,28,28)</span>"];
end
end
subgraph sg_m8["stages"]
style sg_m8 fill:#000000,fill-opacity:0.05,stroke:#000000,stroke-opacity:0.75,stroke-width:1px
subgraph sg_m9["Sequential x 4"]
style sg_m9 fill:#000000,fill-opacity:0.05,stroke:#000000,stroke-opacity:0.75,stroke-width:1px
m9_in(["Input"]);
m9_out(["Output"]);
style m9_in fill:#e2e8f0,stroke:#64748b,stroke-width:1px;
style m9_out fill:#e2e8f0,stroke:#64748b,stroke-width:1px;
m10(["_Block_V2 x 3"]);
end
end
m11["AdaptiveAvgPool2d<br/><span style='font-size:11px;color:#b7791f;font-weight:400'>(1,1024,7,7) → (1,1024,1,1)</span>"];
m12["LayerNorm"];
m13["Linear<br/><span style='font-size:11px;color:#2b6cb0;font-weight:400'>(1,1024) → (1,1000)</span>"];
end
input["Input<br/><span style='font-size:11px;color:#a67c00;font-weight:400'>(1,3,224,224)</span>"];
output["Output<br/><span style='font-size:11px;color:#a67c00;font-weight:400'>(1,1000)</span>"];
style input fill:#fff3cd,stroke:#a67c00,stroke-width:1px;
style output fill:#fff3cd,stroke:#a67c00,stroke-width:1px;
style m3 fill:#ffe8e8,stroke:#c53030,stroke-width:1px;
style m7 fill:#ffe8e8,stroke:#c53030,stroke-width:1px;
style m11 fill:#fefcbf,stroke:#b7791f,stroke-width:1px;
style m12 fill:#e6fffa,stroke:#2c7a7b,stroke-width:1px;
style m13 fill:#ebf8ff,stroke:#2b6cb0,stroke-width:1px;
input --> m3;
m10 -.-> m6;
m10 --> m9_out;
m11 --> m12;
m12 --> m13;
m13 --> output;
m3 --> m4;
m4 -.-> m10;
m5_in -.-> m6;
m5_out -.-> m9_in;
m6 --> m7;
m7 --> m5_out;
m7 -.-> m9_in;
m9_in -.-> m10;
m9_out --> m11;
m9_out --> m5_in;
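The tensor shapes annotated in the diagram can be checked with a small shape trace. The following is a minimal sketch in plain Python, not lucid code. It assumes the stem is an unpadded 4x4 convolution with stride 4 and each later downsampling layer is an unpadded 2x2 convolution with stride 2, as in the reference ConvNeXt design. The intermediate width of 512 is inferred from the 128, 256, and 1024 values shown in the diagram, and the names `conv_out` and `trace_shapes` are hypothetical helpers.

```python
def conv_out(size: int, kernel: int, stride: int) -> int:
    """Spatial output size of an unpadded convolution."""
    return (size - kernel) // stride + 1


def trace_shapes(image_size=224, dims=(128, 256, 512, 1024), num_classes=1000):
    """Return (name, shape) pairs for a batch of one RGB image.

    Assumed layout: a 4x4/stride-4 stem followed by three 2x2/stride-2
    downsampling convolutions, then global pooling and a linear head.
    """
    shapes = [("input", (1, 3, image_size, image_size))]

    # Stem: reduces the spatial size by a factor of 4.
    size = conv_out(image_size, kernel=4, stride=4)
    shapes.append(("stem", (1, dims[0], size, size)))

    # Three further downsampling layers, each halving the spatial size.
    for i, dim in enumerate(dims[1:], start=1):
        size = conv_out(size, kernel=2, stride=2)
        shapes.append((f"downsample_{i}", (1, dim, size, size)))

    # Global average pooling collapses the spatial dimensions.
    shapes.append(("pool", (1, dims[-1], 1, 1)))
    shapes.append(("head", (1, num_classes)))
    return shapes


for name, shape in trace_shapes():
    print(f"{name:>13}: {shape}")
```

The blocks inside each stage preserve the shape of their input, so only the stem, the downsampling layers, the pooling layer, and the classifier change the tensor shape.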
Class Signature
class ConvNeXt_V2(ConvNeXt):
    def __init__(self, config: ConvNeXtV2Config) -> None
Parameters
config (ConvNeXtV2Config): Configuration object describing the V2 stage depths, stage widths, classifier size, and drop-path rate.
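To illustrate how stage depths and a drop-path rate can interact, here is a minimal sketch in plain Python of a per-block drop-path schedule. It assumes the linear schedule used by the reference ConvNeXt implementation, where the rate grows from 0 at the first block to the configured maximum at the last block. Whether lucid distributes the rate this way is not confirmed by this page. The depths `(3, 3, 27, 3)` are the ConvNeXt-Base values from the original paper, used here as an assumption, and `drop_path_schedule` is a hypothetical helper.

```python
def drop_path_schedule(depths, drop_path_rate):
    """Split a linearly increasing drop-path rate across stages.

    Assumption: rates rise linearly from 0 to `drop_path_rate` over all
    blocks, then are grouped per stage according to `depths`.
    """
    total = sum(depths)
    steps = max(total - 1, 1)  # avoid division by zero for a single block
    rates = [drop_path_rate * i / steps for i in range(total)]

    per_stage, start = [], 0
    for depth in depths:
        per_stage.append(rates[start:start + depth])
        start += depth
    return per_stage


schedule = drop_path_schedule(depths=(3, 3, 27, 3), drop_path_rate=0.1)
for stage_index, stage_rates in enumerate(schedule):
    print(stage_index, [round(r, 4) for r in stage_rates])
```

Under this assumption, early blocks are almost never dropped, while the deepest blocks are dropped with a probability close to the configured rate.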
Examples
Basic Example
import lucid
import lucid.models as models
config = models.ConvNeXtV2Config(num_classes=1000)
model = models.ConvNeXt_V2(config)
# Input tensor with shape (1, 3, 224, 224)
input_ = lucid.random.randn(1, 3, 224, 224)
# Perform forward pass
output = model(input_)
print(output.shape) # Shape: (1, 1000)