ConvNeXt_V2

ConvNet

class lucid.models.ConvNeXt_V2(config: ConvNeXtV2Config)

The ConvNeXt_V2 class in lucid.models builds on the original ConvNeXt architecture for image classification. It keeps the hierarchical design of its predecessor, a stem followed by four stages separated by downsampling layers, and uses V2 blocks (_Block_V2 in the diagram below) inside each stage. Model structure is defined through ConvNeXtV2Config.

        %%{init: {"flowchart":{"curve":"monotoneX","nodeSpacing":50,"rankSpacing":50}} }%%
flowchart LR
  linkStyle default stroke-width:2.0px
  subgraph sg_m0["<span style='font-size:20px;font-weight:700'>convnext_v2_base</span>"]
  style sg_m0 fill:#000000,fill-opacity:0.05,stroke:#000000,stroke-opacity:0.75,stroke-width:1px
    subgraph sg_m1["downsample_layers"]
    style sg_m1 fill:#000000,fill-opacity:0.05,stroke:#000000,stroke-opacity:0.75,stroke-width:1px
      subgraph sg_m2["Sequential"]
      style sg_m2 fill:#000000,fill-opacity:0.05,stroke:#000000,stroke-opacity:0.75,stroke-width:1px
        m3["Conv2d<br/><span style='font-size:11px;color:#c53030;font-weight:400'>(1,3,224,224) → (1,128,56,56)</span>"];
        m4["_ChannelsFisrtLayerNorm"];
      end
      subgraph sg_m5["Sequential x 3"]
      style sg_m5 fill:#000000,fill-opacity:0.05,stroke:#000000,stroke-opacity:0.75,stroke-width:1px
        m5_in(["Input"]);
        m5_out(["Output"]);
  style m5_in fill:#e2e8f0,stroke:#64748b,stroke-width:1px;
  style m5_out fill:#e2e8f0,stroke:#64748b,stroke-width:1px;
        m6["_ChannelsFisrtLayerNorm"];
        m7["Conv2d<br/><span style='font-size:11px;color:#c53030;font-weight:400'>(1,128,56,56) → (1,256,28,28)</span>"];
      end
    end
    subgraph sg_m8["stages"]
    style sg_m8 fill:#000000,fill-opacity:0.05,stroke:#000000,stroke-opacity:0.75,stroke-width:1px
      subgraph sg_m9["Sequential x 4"]
      style sg_m9 fill:#000000,fill-opacity:0.05,stroke:#000000,stroke-opacity:0.75,stroke-width:1px
        m9_in(["Input"]);
        m9_out(["Output"]);
  style m9_in fill:#e2e8f0,stroke:#64748b,stroke-width:1px;
  style m9_out fill:#e2e8f0,stroke:#64748b,stroke-width:1px;
        m10(["_Block_V2 x 3"]);
      end
    end
    m11["AdaptiveAvgPool2d<br/><span style='font-size:11px;color:#b7791f;font-weight:400'>(1,1024,7,7) → (1,1024,1,1)</span>"];
    m12["LayerNorm"];
    m13["Linear<br/><span style='font-size:11px;color:#2b6cb0;font-weight:400'>(1,1024) → (1,1000)</span>"];
  end
  input["Input<br/><span style='font-size:11px;color:#a67c00;font-weight:400'>(1,3,224,224)</span>"];
  output["Output<br/><span style='font-size:11px;color:#a67c00;font-weight:400'>(1,1000)</span>"];
  style input fill:#fff3cd,stroke:#a67c00,stroke-width:1px;
  style output fill:#fff3cd,stroke:#a67c00,stroke-width:1px;
  style m3 fill:#ffe8e8,stroke:#c53030,stroke-width:1px;
  style m7 fill:#ffe8e8,stroke:#c53030,stroke-width:1px;
  style m11 fill:#fefcbf,stroke:#b7791f,stroke-width:1px;
  style m12 fill:#e6fffa,stroke:#2c7a7b,stroke-width:1px;
  style m13 fill:#ebf8ff,stroke:#2b6cb0,stroke-width:1px;
  input --> m3;
  m10 -.-> m6;
  m10 --> m9_out;
  m11 --> m12;
  m12 --> m13;
  m13 --> output;
  m3 --> m4;
  m4 -.-> m10;
  m5_in -.-> m6;
  m5_out -.-> m9_in;
  m6 --> m7;
  m7 --> m5_out;
  m7 -.-> m9_in;
  m9_in -.-> m10;
  m9_out --> m11;
  m9_out --> m5_in;
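
The tensor shapes annotated in the diagram can be checked by hand. The sketch below is plain Python and does not call lucid; it assumes the standard ConvNeXt layout (a 4x4 stride-4 stem convolution and 2x2 stride-2 downsampling convolutions) and the base widths (128, 256, 512, 1024) implied by the diagram. The helper names `conv_out` and `trace_shapes` are hypothetical, introduced only for illustration.

```python
def conv_out(size: int, kernel: int, stride: int, padding: int = 0) -> int:
    """Output length of a convolution along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_shapes(height=224, width=224, in_channels=3,
                 dims=(128, 256, 512, 1024), num_classes=1000, batch=1):
    """Return (name, shape) pairs for the main points of the network."""
    shapes = [("input", (batch, in_channels, height, width))]

    # Assumption: the stem is a 4x4 convolution with stride 4.
    height, width = conv_out(height, 4, 4), conv_out(width, 4, 4)
    shapes.append(("stage 1", (batch, dims[0], height, width)))

    # Assumption: each later stage is preceded by a 2x2 convolution with stride 2.
    for index, dim in enumerate(dims[1:], start=2):
        height, width = conv_out(height, 2, 2), conv_out(width, 2, 2)
        shapes.append((f"stage {index}", (batch, dim, height, width)))

    # Global average pooling collapses the spatial axes; the head maps to classes.
    shapes.append(("pool", (batch, dims[-1], 1, 1)))
    shapes.append(("logits", (batch, num_classes)))
    return shapes


for name, shape in trace_shapes():
    print(f"{name:>8}: {shape}")
```

Under these assumptions the spatial size goes 224, 56, 28, 14, 7 while the channel count doubles at each downsampling step, which matches the shapes shown on the Conv2d, AdaptiveAvgPool2d, and Linear nodes.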

Class Signature

class ConvNeXt_V2(ConvNeXt):
    def __init__(self, config: ConvNeXtV2Config) -> None

Parameters

config (ConvNeXtV2Config): Configuration object describing the V2 stage depths, stage widths, classifier size, and drop-path rate.
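
To illustrate how a single drop-path rate can relate to the stage depths, the sketch below spreads the rate linearly over all blocks, from 0 for the first block to the configured maximum for the last, as the reference ConvNeXt implementation does. Whether lucid applies the same schedule is an assumption, and `drop_path_schedule` is a hypothetical helper, not part of the lucid API. The depths and rate in the usage lines are illustrative values only.

```python
def drop_path_schedule(depths, drop_path_rate):
    """Split a linearly increasing drop-path schedule into per-stage lists.

    Assumption: rates grow linearly from 0.0 to drop_path_rate across
    all blocks, counted in order from the first stage to the last.
    """
    total = sum(depths)
    step = drop_path_rate / (total - 1) if total > 1 else 0.0
    rates = [step * i for i in range(total)]

    per_stage, start = [], 0
    for depth in depths:
        per_stage.append(rates[start:start + depth])
        start += depth
    return per_stage


# Illustrative values: four stages and a maximum rate of 0.1.
schedule = drop_path_schedule(depths=(3, 3, 9, 3), drop_path_rate=0.1)
for stage_index, stage_rates in enumerate(schedule, start=1):
    print(stage_index, [round(rate, 4) for rate in stage_rates])
```

With a schedule like this, early blocks are almost never dropped during training and the deepest blocks are dropped most often.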

Examples

Basic Example

import lucid  # needed for lucid.random below
import lucid.models as models

config = models.ConvNeXtV2Config(num_classes=1000)
model = models.ConvNeXt_V2(config)

# Input tensor with shape (1, 3, 224, 224)
input_ = lucid.random.randn(1, 3, 224, 224)

# Perform forward pass
output = model(input_)
print(output.shape)  # Shape: (1, 1000)