U-Net 3D

ConvNet Segmentation ConvNet

class lucid.models.UNet3d(config: UNetConfig)

UNet3d is a configurable 3D encoder-decoder segmentation model with skip connections between encoder and decoder stages. It shares the same lucid.models.UNetConfig interface as lucid.models.UNet2d but operates on volumetric inputs \((N, C, D, H, W)\), replacing all 2D convolutions, normalizations, pooling layers, and upsampling operations with their 3D counterparts.

Typical use cases include medical image segmentation (CT/MRI volumes), video feature extraction, and any task requiring dense prediction over 3D spatial data.

Class Signature

class UNet3d(nn.Module):
    def __init__(self, config: UNetConfig) -> None

Parameters

  • config (UNetConfig): Model configuration describing the encoder/decoder stage layout, block type, skip merge behavior, sampling strategy, and segmentation output space. When upsample_mode=”bilinear” is passed, it is automatically remapped to “trilinear” for 3D compatibility.

Methods

UNet3d.forward(x: Tensor) Tensor | dict[str, Tensor | list[Tensor]]

Examples

Build a Basic U-Net 3D

import lucid
import lucid.models as models

cfg = models.UNetConfig.from_channels(
    in_channels=1,
    out_channels=2,
    channels=(32, 64, 128, 256),
    num_blocks=2,
)
model = models.UNet3d(cfg)

# (batch, channels, depth, height, width)
x = lucid.random.randn(1, 1, 64, 128, 128)
logits = model(x)
print(logits.shape)  # (1, 2, 64, 128, 128)

Build a U-Net 3D with Trilinear Upsampling and Deep Supervision

import lucid
import lucid.models as models

cfg = models.UNetConfig(
    in_channels=1,
    out_channels=3,
    encoder_stages=[
        models.UNetStageConfig(channels=32, num_blocks=2),
        models.UNetStageConfig(channels=64, num_blocks=2, use_attention=True),
        models.UNetStageConfig(channels=128, num_blocks=3),
        models.UNetStageConfig(channels=256, num_blocks=3),
    ],
    block="res",
    norm="group",
    act="silu",
    upsample_mode="trilinear",
    deep_supervision=True,
)
model = models.UNet3d(cfg)

x = lucid.random.randn(1, 1, 32, 64, 64)
out = model(x)
print(out["out"].shape)  # (1, 3, 32, 64, 64)
print(len(out["aux"]))   # 2

Notes

  • UNet3d expects volumetric tensors with shape \((N, C, D, H, W)\). For 2D image inputs \((N, C, H, W)\), use lucid.models.UNet2d.

  • Passing upsample_mode=”bilinear” is accepted and silently remapped to “trilinear”, enabling existing 2D configs to be reused for 3D models.

  • The current implementation supports block=”basic” and block=”res”.

  • When deep_supervision=False, lucid.models.UNet3d.forward() returns a single output tensor. When enabled, it returns a dictionary with out and aux predictions, identical to the 2D variant.