class Conv3d extends Module

Conv3d(in_channels: int, out_channels: int, kernel_size: _Size3d, stride: _Size3d = 1, padding: _Size3d | str = 0, dilation: _Size3d = 1, groups: int = 1, bias: bool = True, padding_mode: PaddingMode = 'zeros', device: DeviceLike = None, dtype: DTypeLike = None)


Implementing kernel: C++ class ConvNdBackward

Applies a 3D convolution over volumetric data (e.g. video or medical scans).

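Computes the 3D cross-correlation of the input with a bank of learnable 3D filters. Here $s_d, s_h, s_w$ are the per-axis strides, $d_d, d_h, d_w$ the per-axis dilations, and $g$ the number of groups. Spatial indices refer to the padded input, and for $g > 1$ the index $c_{\text{in}}$ runs over the input channels of the group that $c_{\text{out}}$ belongs to: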

y[n, c_{\text{out}}, d, h, w] = \sum_{c_{\text{in}}=0}^{C_{\text{in}}/g - 1} \sum_{k_d, k_h, k_w} x\!\left[n,\; c_{\text{in}},\; d \cdot s_d + k_d \cdot d_d,\; h \cdot s_h + k_h \cdot d_h,\; w \cdot s_w + k_w \cdot d_w\right] \cdot W\!\left[c_{\text{out}},\; c_{\text{in}},\; k_d,\; k_h,\; k_w\right] + b\!\left[c_{\text{out}}\right]
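
The sum above can be spelled out as a naive reference loop. The sketch below is pure Python on nested lists for a single sample. It is not the library's kernel; `conv3d_reference` and `full` are hypothetical helper names, and it assumes zero padding and contiguous channel groups.

```python
from itertools import product


def full(shape, value):
    """Nested list of the given shape filled with `value` (hypothetical helper)."""
    if len(shape) == 1:
        return [value] * shape[0]
    return [full(shape[1:], value) for _ in range(shape[0])]


def conv3d_reference(x, weight, bias=None, stride=(1, 1, 1),
                     padding=(0, 0, 0), dilation=(1, 1, 1), groups=1):
    """Naive 3D cross-correlation for one sample (illustrative only).

    x:      nested list indexed [c_in][d][h][w]
    weight: nested list indexed [c_out][c_in // groups][k_d][k_h][k_w]
    """
    c_in, c_out = len(x), len(weight)
    size = (len(x[0]), len(x[0][0]), len(x[0][0][0]))             # (D, H, W)
    kernel = (len(weight[0][0]), len(weight[0][0][0]),
              len(weight[0][0][0][0]))                            # (K_D, K_H, K_W)
    out_size = tuple(
        (n + 2 * p - d * (k - 1) - 1) // s + 1
        for n, p, d, k, s in zip(size, padding, dilation, kernel, stride)
    )
    in_per_group, out_per_group = c_in // groups, c_out // groups

    y = full((c_out, *out_size), 0.0)
    for co in range(c_out):
        # First input channel of the group that output channel `co` belongs to.
        first_ci = (co // out_per_group) * in_per_group
        for od, oh, ow in product(*map(range, out_size)):
            acc = bias[co] if bias is not None else 0.0
            for ci, kd, kh, kw in product(range(in_per_group), *map(range, kernel)):
                # Position in the unpadded input; out of range means zero padding.
                pos = [o * s + k * d - p for o, s, k, d, p in
                       zip((od, oh, ow), stride, (kd, kh, kw), dilation, padding)]
                if all(0 <= q < n for q, n in zip(pos, size)):
                    acc += (x[first_ci + ci][pos[0]][pos[1]][pos[2]]
                            * weight[co][ci][kd][kh][kw])
            y[co][od][oh][ow] = acc
    return y


# All-ones 3x3x3 volume and all-ones 3x3x3 filter with padding 1:
# the centre sees 27 input voxels, a corner sees only 8.
x = full((1, 3, 3, 3), 1.0)
w = full((1, 1, 3, 3, 3), 1.0)
y = conv3d_reference(x, w, padding=(1, 1, 1))
print(y[0][1][1][1], y[0][0][0][0])  # 27.0 8.0
```

The loop is far too slow for real volumes, but it can serve as a ground truth when checking a fast implementation on small inputs.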

Parameters

in_channels : int

Number of channels in the input volume.

out_channels : int

Number of channels produced by the convolution.

kernel_size : int or tuple[int, int, int]

Size of the 3D convolving kernel (K_D, K_H, K_W). A single int is broadcast to all three dimensions.

stride : int or tuple[int, int, int] = 1

Stride along each spatial dimension. Default: 1.

padding : int, tuple[int, int, int], or str = 0

Padding added to both sides of each spatial dimension, filled according to padding_mode. Also accepts "same" (output spatial size equals input size; requires stride=1) or "valid" (no padding). Default: 0.

dilation : int or tuple[int, int, int] = 1

Spacing between kernel elements. Default: 1.

groups : int = 1

Number of blocked connections from input channels to output channels; in_channels must be divisible by groups. groups = in_channels gives depthwise 3D convolution. Default: 1.

bias : bool = True

If True, adds a learnable bias. Default: True.

padding_mode : str = 'zeros'

How padded positions are filled: "zeros", "reflect", "replicate", or "circular". Default: "zeros".

device : DeviceLike = None

Device on which to allocate parameters. Default: None.

dtype : DTypeLike = None

Data type for the parameters. Default: None.
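
To make the padding options concrete, here is a minimal sketch of how the string and integer forms could be resolved into per-axis amounts. `resolve_padding` is a hypothetical helper, not a library function. It assumes the common convention that "same" pads a total of dilation * (kernel - 1) per axis and puts any odd remainder on the trailing side; the library's own handling may differ.

```python
def resolve_padding(padding, kernel_size, dilation=(1, 1, 1)):
    """Return [(before, after), ...] padding for the D, H, W axes.

    Hypothetical helper; assumes stride 1 when padding == "same".
    """
    if padding == "valid":
        return [(0, 0)] * 3
    if padding == "same":
        pads = []
        for k, d in zip(kernel_size, dilation):
            total = d * (k - 1)       # total padding that keeps the size unchanged
            before = total // 2
            pads.append((before, total - before))
        return pads
    if isinstance(padding, int):
        padding = (padding,) * 3      # broadcast a single int to all three axes
    return [(p, p) for p in padding]


print(resolve_padding("same", (3, 3, 3)))   # [(1, 1), (1, 1), (1, 1)]
print(resolve_padding("same", (4, 3, 3)))   # [(1, 2), (1, 1), (1, 1)]
print(resolve_padding("valid", (3, 3, 3)))  # [(0, 0), (0, 0), (0, 0)]
```

With a total padding of dilation * (kernel - 1) and stride 1, the output size formula reduces to the input size, which is why "same" is tied to stride=1.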

Attributes

weight : Parameter

Learnable filter tensor of shape (out_channels, in_channels // groups, K_D, K_H, K_W). Initialized with Kaiming uniform:

\text{fan\_in} = \frac{C_{\text{in}}}{g} \cdot K_D \cdot K_H \cdot K_W, \quad W \sim \mathcal{U}\!\left[ -\sqrt{\tfrac{6}{\text{fan\_in}}},\; \sqrt{\tfrac{6}{\text{fan\_in}}} \right]
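
As a worked instance of the weight shape and initialisation bound above, the following sketch computes them for a given configuration. `conv3d_weight_stats` is a hypothetical helper; it only evaluates the formulas stated here and does not call the library.

```python
import math


def conv3d_weight_stats(in_channels, out_channels, kernel_size, groups=1, bias=True):
    """Weight shape, fan_in, uniform bound, and parameter count (hypothetical helper)."""
    k_d, k_h, k_w = kernel_size
    shape = (out_channels, in_channels // groups, k_d, k_h, k_w)
    fan_in = (in_channels // groups) * k_d * k_h * k_w
    bound = math.sqrt(6.0 / fan_in)            # W ~ U[-bound, bound]
    n_params = math.prod(shape) + (out_channels if bias else 0)
    return shape, fan_in, bound, n_params


shape, fan_in, bound, n_params = conv3d_weight_stats(16, 32, (3, 3, 3))
print(shape, fan_in, n_params)   # (32, 16, 3, 3, 3) 432 13856
print(round(bound, 4))           # 0.1179

# Depthwise-style grouping shrinks both fan_in and the parameter count.
print(conv3d_weight_stats(16, 32, (3, 3, 3), groups=16)[3])  # 896
```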

bias : Parameter or None

Learnable bias of shape (out_channels,), or None if bias=False.

Notes

Shape. Input: $(N, C_{\text{in}}, D, H, W)$. Output: $(N, C_{\text{out}}, D_{\text{out}}, H_{\text{out}}, W_{\text{out}})$, where

X_{\text{out}} = \left\lfloor \frac{X + 2p_x - d_x(K_X - 1) - 1}{s_x} + 1 \right\rfloor \quad \text{for } X \in \{D, H, W\}
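
The output-size formula can be evaluated directly. The sketch below uses a hypothetical helper, `conv3d_output_size`, that applies it per axis with plain integer arithmetic; for integer inputs and a positive stride, floor division gives the same result as the floor in the formula.

```python
def conv3d_output_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Apply the output-size formula to (D, H, W) (hypothetical helper)."""
    def triple(v):
        # A single int is broadcast to all three spatial axes.
        return (v, v, v) if isinstance(v, int) else tuple(v)

    return tuple(
        (x + 2 * p - d * (k - 1) - 1) // s + 1
        for x, k, s, p, d in zip(size, triple(kernel_size), triple(stride),
                                 triple(padding), triple(dilation))
    )


print(conv3d_output_size((16, 32, 32), 3, padding=1))            # (16, 32, 32)
print(conv3d_output_size((16, 32, 32), 3, stride=2, padding=1))  # (8, 16, 16)
print(conv3d_output_size((16, 32, 32), 3, dilation=2))           # (12, 28, 28)
```

The first two results match the spatial shapes of the basic and strided examples in the Examples section.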

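Typical use cases. Conv3d is the standard building block for video understanding (3D ResNets, SlowFast), medical image analysis (CT/MRI volumetric segmentation), and voxelised point-cloud processing. For each output element it needs roughly $K_D$ times as many multiply-accumulates as a Conv2d with the same in-plane kernel; factorised (2+1)D convolutions, which apply a spatial $1 \times K_H \times K_W$ convolution followed by a temporal $K_D \times 1 \times 1$ convolution, are a common approximation.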
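
The cost comparison can be checked with a little arithmetic. The sketch below counts multiply-accumulates per output element and compares weight counts for a full 3D kernel against a (2+1)D factorisation. The helper name and the channel sizes are illustrative assumptions, and `mid = 144` is a hypothetical intermediate width chosen so that the two parameter counts match for this configuration.

```python
def macs_per_output(in_channels, kernel_size, groups=1):
    """Multiply-accumulates needed for one output element (hypothetical helper)."""
    macs = in_channels // groups
    for k in kernel_size:
        macs *= k
    return macs


c_in, c_out, k = 64, 64, 3

conv2d_macs = macs_per_output(c_in, (k, k))      # 64 * 3 * 3
conv3d_macs = macs_per_output(c_in, (k, k, k))   # 64 * 3 * 3 * 3
print(conv3d_macs / conv2d_macs)  # 3.0, i.e. K_D

# (2+1)D: a 1 x k x k spatial conv into `mid` channels, then a k x 1 x 1 temporal conv.
mid = 144  # hypothetical width, picked to match the 3D parameter count
params_3d = c_out * c_in * k ** 3
params_2plus1d = mid * c_in * k * k + c_out * mid * k
print(params_3d, params_2plus1d)  # 110592 110592
```

A smaller `mid` trades accuracy of the approximation for fewer parameters and less compute.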

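Memory. A single 3D feature map holds $N \cdot C \cdot D \cdot H \cdot W$ elements and can be large. groups > 1 or a smaller kernel_size reduce parameter count and compute; a larger stride or fewer out_channels reduce the size of the output feature map itself.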
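
For a rough sense of scale, the size of one feature map can be estimated from its shape. This sketch assumes 4 bytes per element (float32); the library's default dtype is not stated here, and `feature_map_bytes` is a hypothetical helper.

```python
import math


def feature_map_bytes(shape, bytes_per_element=4):
    """Bytes needed to store one tensor of the given shape (assumes float32 by default)."""
    return math.prod(shape) * bytes_per_element


MIB = 2 ** 20
print(feature_map_bytes((2, 16, 16, 32, 32)) / MIB)     # 2.0
print(feature_map_bytes((2, 32, 8, 16, 16)) / MIB)      # 0.5
print(feature_map_bytes((1, 64, 32, 224, 224)) / MIB)   # 392.0
```

The first two shapes are the outputs of the examples in this page; the third is an illustrative video-sized activation, which shows how quickly a single layer's output can dominate memory during training.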

Examples

Basic volumetric convolution:
>>> import lucid
>>> import lucid.nn as nn
>>> conv3 = nn.Conv3d(in_channels=1, out_channels=16,
...                   kernel_size=3, padding=1)
>>> x = lucid.zeros(2, 1, 16, 32, 32)   # (N, C, D, H, W)
>>> y = conv3(x)
>>> y.shape
(2, 16, 16, 32, 32)

Strided 3D convolution for spatial downsampling:
>>> import lucid
>>> import lucid.nn as nn
>>> conv3_stride = nn.Conv3d(16, 32, kernel_size=3, stride=2, padding=1)
>>> x = lucid.zeros(2, 16, 16, 32, 32)
>>> y = conv3_stride(x)
>>> y.shape
(2, 32, 8, 16, 16)


Constructors

__init__

→None

__init__(in_channels: int, out_channels: int, kernel_size: _Size3d, stride: _Size3d = 1, padding: _Size3d | str = 0, dilation: _Size3d = 1, groups: int = 1, bias: bool = True, padding_mode: PaddingMode = 'zeros', device: DeviceLike = None, dtype: DTypeLike = None)


Initialise the Conv3d module. See the class docstring for parameter semantics.

Instance methods

extra_repr

→str

extra_repr()


Return a string representation of the layer's configuration.

forward

→Tensor

forward(x: Tensor)


Apply the convolution to the input tensor.

Parameters

x : Tensor

Input tensor of shape $(N, C_{\text{in}}, D, H, W)$.

Returns

Tensor

Output tensor of shape $(N, C_{\text{out}}, D_{\text{out}}, H_{\text{out}}, W_{\text{out}})$, with spatial dimensions determined by stride, padding, dilation, and kernel size.
