Conv3d
Module
Conv3d(in_channels: int, out_channels: int, kernel_size: _Size3d, stride: _Size3d = 1, padding: _Size3d | str = 0, dilation: _Size3d = 1, groups: int = 1, bias: bool = True, padding_mode: str = 'zeros', device: DeviceLike = None, dtype: DTypeLike = None)
Applies a 3D convolution over volumetric data (e.g. video or medical scans).
Computes the 3D cross-correlation of the input with a bank of learnable 3D filters:
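A standard way to write this operation, assuming groups=1 and zero-based channel indices, is shown below. Here $\star$ denotes 3D cross-correlation, $N_i$ indexes the batch, and $j$ indexes output channels. The index notation is an illustrative assumption and not necessarily the library's own.

$$
\text{out}(N_i, C_{\text{out}_j}) = \text{bias}(C_{\text{out}_j}) + \sum_{k=0}^{C_{\text{in}} - 1} \text{weight}(C_{\text{out}_j}, k) \star \text{input}(N_i, k)
$$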
Parameters
in_channels : int
out_channels : int
kernel_size : int or tuple[int, int, int]
    (K_D, K_H, K_W). A single int is broadcast to all three dimensions.
stride : int or tuple[int, int, int] = 1
    Default: 1.
padding : int, tuple[int, int, int], or str = 0
    An int or tuple, or the string "same" (requires stride=1) or "valid". Default: 0.
dilation : int or tuple[int, int, int] = 1
    Default: 1.
groups : int = 1
    groups = in_channels gives depthwise 3D convolution. Default: 1.
bias : bool = True
    If True, adds a learnable bias. Default: True.
padding_mode : str = 'zeros'
    "zeros", "reflect", "replicate", or "circular". Default: "zeros".
device : DeviceLike = None
    Default: None.
dtype : DTypeLike = None
    Default: None.
Attributes
weight : Parameter
    Shape (out_channels, in_channels // groups, K_D, K_H, K_W). Initialized with Kaiming uniform.
bias : Parameter or None
    Shape (out_channels,), or None when bias is False.
Notes
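Input: (N, C_in, D, H, W). Output: (N, C_out, D_out, H_out, W_out), where
$$
X_{\text{out}} = \left\lfloor \frac{X + 2p_X - d_X(K_X - 1) - 1}{s_X} + 1 \right\rfloor \quad \text{for } X \in \{D, H, W\}
$$
Here p_X, d_X, K_X, and s_X are the padding, dilation, kernel size, and stride along axis X.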
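The output-size formula can be checked by hand with a few lines of plain Python. The sketch below is not part of the library; `conv3d_output_size` and `conv3d_output_shape` are hypothetical helper names, and only integer padding is handled (the "same" and "valid" strings are left out).

```python
def _triple(value):
    """Broadcast a single int to all three axes, as the layer does for its arguments."""
    return (value, value, value) if isinstance(value, int) else tuple(value)


def conv3d_output_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Apply the output-size formula to one axis (D, H, or W)."""
    # floor(a / s + 1) equals floor(a / s) + 1, so integer floor division is enough.
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def conv3d_output_shape(input_shape, out_channels, kernel_size,
                        stride=1, padding=0, dilation=1):
    """Predict (N, C_out, D_out, H_out, W_out) from an (N, C_in, D, H, W) input shape."""
    n, _c_in, *spatial = input_shape
    k, s, p, d = (_triple(v) for v in (kernel_size, stride, padding, dilation))
    out_spatial = tuple(
        conv3d_output_size(size, k[i], s[i], p[i], d[i])
        for i, size in enumerate(spatial)
    )
    return (n, out_channels) + out_spatial


# The two configurations used in the Examples section of this page.
same_size = conv3d_output_shape((2, 1, 16, 32, 32), 16, kernel_size=3, padding=1)
halved = conv3d_output_shape((2, 16, 16, 32, 32), 32, kernel_size=3, stride=2, padding=1)
print(same_size)
print(halved)
```

With kernel_size=3 and padding=1 the spatial size is preserved at stride 1, and stride 2 roughly halves each axis.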
Typical use cases. Conv3d is the standard building block for video understanding (3D ResNets, SlowFast), medical image analysis (CT/MRI volumetric segmentation), and point-cloud processing. It is computationally heavier than Conv2d: for the same K_H × K_W, each output element needs roughly K_D times as many multiply-accumulates. Factorised (2+1)D convolutions are a common approximation.
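To make the cost comparison concrete, the sketch below counts weights for a full 3D kernel and for a (2+1)D factorisation, taken here to mean a 1 × K_H × K_W convolution followed by a K_D × 1 × 1 convolution. The helper names and the intermediate channel count `c_mid` are illustrative assumptions, not library API; bias terms are ignored.

```python
def conv3d_weight_count(c_in, c_out, k_d, k_h, k_w):
    """Weights in a full 3D convolution (groups=1, bias ignored)."""
    return c_out * c_in * k_d * k_h * k_w


def conv2plus1d_weight_count(c_in, c_mid, c_out, k_d, k_h, k_w):
    """Weights in a (2+1)D pair: a 1 x K_H x K_W conv, then a K_D x 1 x 1 conv."""
    spatial = c_mid * c_in * k_h * k_w   # per-slice 2D convolution
    temporal = c_out * c_mid * k_d       # 1D convolution along depth
    return spatial + temporal


full = conv3d_weight_count(64, 64, 3, 3, 3)
factorised = conv2plus1d_weight_count(64, 64, 64, 3, 3, 3)
print(full, factorised)  # 110592 49152
```

How `c_mid` is chosen changes the balance: a larger value recovers capacity at the price of more weights.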
Memory. A single 3D feature map can be large; consider groups > 1 or smaller kernel_size when memory is a concern.
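A rough way to size things up before building a model is to estimate bytes per feature map and weights per layer separately. The sketch below is a back-of-the-envelope aid with hypothetical helper names; the 4 bytes per element assumes float32, which this page does not state as the default dtype.

```python
from math import prod


def feature_map_bytes(shape, bytes_per_element=4):
    """Bytes to hold one (N, C, D, H, W) tensor; 4 bytes per element assumes float32."""
    return prod(shape) * bytes_per_element


def conv3d_weight_count(in_channels, out_channels, kernel_size, groups=1):
    """Weight count from the shape (out_channels, in_channels // groups, K_D, K_H, K_W)."""
    k_d, k_h, k_w = kernel_size
    return out_channels * (in_channels // groups) * k_d * k_h * k_w


activation = feature_map_bytes((2, 16, 16, 32, 32))             # 2 MiB
dense = conv3d_weight_count(16, 32, (3, 3, 3))                  # groups=1
grouped = conv3d_weight_count(16, 32, (3, 3, 3), groups=16)     # groups = in_channels
thin = conv3d_weight_count(16, 32, (1, 3, 3))                   # smaller K_D
print(activation, dense, grouped, thin)  # 2097152 13824 864 4608
```

The sketch keeps the two quantities apart: groups and kernel_size change the weight count, while the feature-map size follows from the output shape.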
Examples
Basic volumetric convolution:
>>> import lucid
>>> import lucid.nn as nn
>>> conv3 = nn.Conv3d(in_channels=1, out_channels=16,
... kernel_size=3, padding=1)
>>> x = lucid.zeros(2, 1, 16, 32, 32) # (N, C, D, H, W)
>>> y = conv3(x)
>>> y.shape
(2, 16, 16, 32, 32)
Strided 3D convolution for spatial downsampling:
>>> import lucid
>>> import lucid.nn as nn
>>> conv3_stride = nn.Conv3d(16, 32, kernel_size=3, stride=2, padding=1)
>>> x = lucid.zeros(2, 16, 16, 32, 32)
>>> y = conv3_stride(x)
>>> y.shape
(2, 32, 8, 16, 16)
Methods (3)
__init__
__init__(in_channels: int, out_channels: int, kernel_size: _Size3d, stride: _Size3d = 1, padding: _Size3d | str = 0, dilation: _Size3d = 1, groups: int = 1, bias: bool = True, padding_mode: str = 'zeros', device: DeviceLike = None, dtype: DTypeLike = None) → None
Initialise the Conv3d module. See the class docstring for parameter semantics.
forward
forward(x: Tensor) → Tensor
Apply the convolution to the input tensor.
Parameters
x : Tensor
    Input tensor of shape (N, C_in, D, H, W).
Returns
Tensor
    Output tensor of shape (N, C_out, D_out, H_out, W_out), with spatial dimensions determined by stride, padding, dilation, and kernel size.
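For intuition about what the forward pass computes, here is a deliberately naive reference for a single channel, written with nested lists and covering only stride 1, no padding, and no dilation. It is a teaching sketch under those assumptions, not how the library implements `forward`; `cross_correlate_3d` is a hypothetical name.

```python
def cross_correlate_3d(volume, kernel):
    """Naive 3D cross-correlation of one channel (stride 1, no padding, no dilation)."""
    D, H, W = len(volume), len(volume[0]), len(volume[0][0])
    KD, KH, KW = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for d in range(D - KD + 1):
        plane = []
        for h in range(H - KH + 1):
            row = []
            for w in range(W - KW + 1):
                # Slide the kernel without flipping it: this is cross-correlation.
                acc = 0.0
                for i in range(KD):
                    for j in range(KH):
                        for k in range(KW):
                            acc += kernel[i][j][k] * volume[d + i][h + j][w + k]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# A 3 x 3 x 3 volume holding the values 0..26, and a 2 x 2 x 2 kernel of ones.
volume = [[[float(d * 9 + h * 3 + w) for w in range(3)] for h in range(3)] for d in range(3)]
kernel = [[[1.0, 1.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]]
result = cross_correlate_3d(volume, kernel)
print(len(result), len(result[0]), len(result[0][0]))  # 2 2 2
```

A full layer repeats this for every input channel in a group, sums the results, and adds the bias for each output channel.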
extra_repr
extra_repr() → str
Return a string representation of the layer's configuration.