class

ConvTranspose3d

extends Module
ConvTranspose3d(in_channels: int, out_channels: int, kernel_size: _Size3d, stride: _Size3d = 1, padding: _Size3d = 0, output_padding: _Size3d = 0, groups: int = 1, bias: bool = True, dilation: _Size3d = 1, device: DeviceLike = None, dtype: DTypeLike = None)

Applies a 3D transposed convolution (fractionally-strided convolution).

The 3D extension of ConvTranspose2d. It upsamples all three spatial dimensions simultaneously and is used in volumetric decoders such as 3D autoencoders, video generators, and medical image synthesis.
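To make "fractionally-strided" concrete, here is a minimal single-channel 1D sketch in plain Python. `conv_transpose1d` is a hypothetical helper written for illustration and is not lucid's implementation. Each input sample scatters a scaled copy of the kernel into the output at steps of `stride`; the 3D layer applies the same idea along D, H, and W.

```python
def conv_transpose1d(x, w, stride=1, padding=0, output_padding=0, dilation=1):
    """Single-channel 1D transposed convolution on plain Python lists (illustrative)."""
    k = len(w)
    # Length of the full output before `padding` is cropped from each side.
    full_len = (len(x) - 1) * stride + dilation * (k - 1) + 1 + output_padding
    full = [0.0] * full_len
    # Every input sample adds a scaled copy of the kernel, shifted by i * stride.
    for i, xi in enumerate(x):
        for j, wj in enumerate(w):
            full[i * stride + j * dilation] += xi * wj
    # `padding` removes that many samples from both ends of the full output.
    return full[padding:full_len - padding]


y = conv_transpose1d([1.0, 2.0, 3.0], [1.0, 1.0, 1.0, 1.0], stride=2, padding=1)
print(y)       # [1.0, 3.0, 3.0, 5.0, 5.0, 3.0]
print(len(y))  # 6
```

Three input samples become six output samples, which is the 2× upsampling that `kernel_size=4, stride=2, padding=1` gives per axis.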

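Output size along each spatial axis $X \in \{D, H, W\}$:

$$X_{\text{out}} = (X_{\text{in}} - 1) \cdot s_X - 2 p_X + d_X (K_X - 1) + p^{\text{out}}_X + 1$$

where $s_X$ is the stride, $p_X$ the padding, $d_X$ the dilation, $K_X$ the kernel size, and $p^{\text{out}}_X$ the output_padding along axis $X$.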
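The formula can be checked by hand with a small helper. This is a sketch in plain Python; `conv_transpose_out_size` is a hypothetical name used only for illustration and is not part of lucid.

```python
def conv_transpose_out_size(x_in, kernel, stride=1, padding=0,
                            output_padding=0, dilation=1):
    """Output size along one spatial axis, per the formula above."""
    return ((x_in - 1) * stride - 2 * padding
            + dilation * (kernel - 1) + output_padding + 1)


# kernel_size=4, stride=2, padding=1 doubles each axis.
d_out = conv_transpose_out_size(4, kernel=4, stride=2, padding=1)
h_out = conv_transpose_out_size(8, kernel=4, stride=2, padding=1)
print(d_out, h_out)  # 8 16
```

Worked through for the depth axis: $(4 - 1) \cdot 2 - 2 \cdot 1 + 1 \cdot (4 - 1) + 0 + 1 = 8$.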

Parameters

in_channels: int
Number of channels in the input volume.
out_channels: int
Number of channels produced by the transposed convolution.
kernel_size: int or tuple[int, int, int]
Size of the 3D convolving kernel.
stride: int or tuple[int, int, int] = 1
Stride of the transposed convolution. Values greater than 1 upsample the spatial dimensions. Default: 1.
padding: int or tuple[int, int, int] = 0
Controls the implicit zero-padding: dilation * (kernel_size - 1) - padding zeros are added to both sides of each spatial axis of the input, so larger values shrink the output. Default: 0.
output_padding: int or tuple[int, int, int] = 0
Additional size added to one side of each spatial dimension of the output. Default: 0.
groups: int = 1
Number of blocked connections from input channels to output channels. Default: 1.
bias: bool = True
If True, adds a learnable bias to the output. Default: True.
dilation: int or tuple[int, int, int] = 1
Spacing between kernel elements. Default: 1.
device: DeviceLike = None
Device on which to allocate parameters. Default: None.
dtype: DTypeLike = None
Data type for the parameters. Default: None.
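The size parameters accept either an int or a 3-tuple. The sketch below assumes that an int applies to all three axes and that tuples are ordered (D, H, W), which matches how the examples on this page use `kernel_size=4`. Both `to_triple` and `out_shape` are hypothetical helpers for illustration, not lucid functions.

```python
def to_triple(value):
    """Expand an int to (v, v, v); pass a 3-sequence through as a tuple."""
    if isinstance(value, int):
        return (value, value, value)
    value = tuple(value)
    if len(value) != 3:
        raise ValueError(f"expected an int or three ints, got {value!r}")
    return value


def out_shape(in_shape, kernel_size, stride=1, padding=0,
              output_padding=0, dilation=1):
    """Spatial output shape (D, H, W) from the per-axis size formula."""
    k, s, p = to_triple(kernel_size), to_triple(stride), to_triple(padding)
    op, d = to_triple(output_padding), to_triple(dilation)
    return tuple(
        (x - 1) * s[i] - 2 * p[i] + d[i] * (k[i] - 1) + op[i] + 1
        for i, x in enumerate(in_shape)
    )


# Upsample H and W by 2x while leaving D unchanged.
shape = out_shape((4, 8, 8), kernel_size=(3, 4, 4), stride=(1, 2, 2), padding=1)
print(shape)  # (4, 16, 16)
```

Per-axis tuples are useful when one axis, such as depth or time, should keep its resolution while the other two are upsampled.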

Attributes

weight: Parameter
Learnable kernel of shape (in_channels, out_channels // groups, K_D, K_H, K_W). The leading axis is in_channels (same convention as ConvTranspose2d). Initialized with Kaiming uniform ($a = \sqrt{5}$).
bias: Parameter or None
Learnable bias of shape (out_channels,), or None when bias=False.
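The shapes above determine the layer's parameter count. The following plain-Python sketch derives both from the documented shapes; `conv_transpose3d_param_count` is a hypothetical helper for illustration, not a lucid function.

```python
def conv_transpose3d_param_count(in_channels, out_channels, kernel_size,
                                 groups=1, bias=True):
    """Return (weight_shape, total_parameters) from the documented shapes."""
    k_d, k_h, k_w = kernel_size
    # Leading axis is in_channels; the second axis is divided by groups.
    weight_shape = (in_channels, out_channels // groups, k_d, k_h, k_w)
    n_weight = 1
    for dim in weight_shape:
        n_weight *= dim
    n_bias = out_channels if bias else 0
    return weight_shape, n_weight + n_bias


shape, total = conv_transpose3d_param_count(64, 32, (4, 4, 4))
print(shape, total)  # (64, 32, 4, 4, 4) 131104
```

With `groups=4` the same layer would have a weight of shape (64, 8, 4, 4, 4), a quarter of the kernel parameters.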

Notes

Input: $(N, C_{\text{in}}, D, H, W)$. Output: $(N, C_{\text{out}}, D_{\text{out}}, H_{\text{out}}, W_{\text{out}})$, with each spatial size given by the formula above.

Memory. Activation size scales with the product of all three spatial dimensions, so each 2× upsampling stage produces a feature map 8× larger per channel. Gradient checkpointing or smaller out_channels values are often necessary when operating on high-resolution volumes.
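A rough size estimate for a single output feature map makes the memory concern concrete. This sketch assumes 4 bytes per element (float32), which is an assumption and not a statement about lucid's default dtype; `feature_map_bytes` is a hypothetical helper.

```python
def feature_map_bytes(batch, channels, depth, height, width, bytes_per_element=4):
    """Bytes needed to store one (N, C, D, H, W) tensor (float32 assumed)."""
    return batch * channels * depth * height * width * bytes_per_element


# Output of a decoder stage with shape (2, 32, 128, 128, 128).
n_bytes = feature_map_bytes(2, 32, 128, 128, 128)
print(n_bytes / 2**20, "MiB")  # 512.0 MiB
```

This counts one activation only. During training, activations kept for the backward pass and their gradients add to the total.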

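Symmetric decoder design. Pair each Conv3d in the encoder with a ConvTranspose3d that uses the same kernel_size, stride, padding, and dilation in the decoder. This restores the encoder's input shape exactly when stride is 1, or more generally when $X_{\text{in}} + 2p_X - d_X(K_X - 1) - 1$ is divisible by the stride on every axis. Otherwise the strided Conv3d rounds down, and output_padding must be set to the remainder to recover the original size.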
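The round trip can be checked per axis before building a model. This sketch assumes Conv3d follows the standard floor-division output-size formula, which is not stated on this page; all three function names are hypothetical.

```python
def conv_out_size(x_in, kernel, stride=1, padding=0, dilation=1):
    """Assumed Conv3d size formula for one axis (floor division)."""
    return (x_in + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


def conv_transpose_out_size(x_in, kernel, stride=1, padding=0,
                            output_padding=0, dilation=1):
    """ConvTranspose3d size formula for one axis, as documented above."""
    return ((x_in - 1) * stride - 2 * padding
            + dilation * (kernel - 1) + output_padding + 1)


def required_output_padding(x_in, kernel, stride=1, padding=0, dilation=1):
    """output_padding needed so that decode(encode(x_in)) == x_in."""
    encoded = conv_out_size(x_in, kernel, stride, padding, dilation)
    decoded = conv_transpose_out_size(encoded, kernel, stride, padding, 0, dilation)
    return x_in - decoded


for size in (16, 15):
    print(size, required_output_padding(size, kernel=3, stride=2, padding=1))
# 16 1
# 15 0
```

With `kernel=3, stride=2, padding=1`, sizes 15 and 16 both encode to 8, so the decoder needs `output_padding=1` to return 16. With `kernel=4, stride=2, padding=1`, an even input size round-trips with no output padding.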

Examples

Volumetric upsampling (2× along all spatial axes):
>>> import lucid
>>> import lucid.nn as nn
>>> up3d = nn.ConvTranspose3d(
...     in_channels=64, out_channels=32,
...     kernel_size=4, stride=2, padding=1
... )
>>> x = lucid.zeros(2, 64, 4, 8, 8)
>>> y = up3d(x)
>>> y.shape
(2, 32, 8, 16, 16)
3D autoencoder decoder block:
>>> import lucid
>>> import lucid.nn as nn
>>> decoder3d = nn.ConvTranspose3d(128, 64, kernel_size=3, stride=1, padding=1)
>>> x = lucid.zeros(1, 128, 8, 8, 8)
>>> y = decoder3d(x)
>>> y.shape
(1, 64, 8, 8, 8)

Methods (3)

dunder

__init__

None
__init__(in_channels: int, out_channels: int, kernel_size: _Size3d, stride: _Size3d = 1, padding: _Size3d = 0, output_padding: _Size3d = 0, groups: int = 1, bias: bool = True, dilation: _Size3d = 1, device: DeviceLike = None, dtype: DTypeLike = None)

Initialise the ConvTranspose3d module. See the class docstring for parameter semantics.

fn

forward

Tensor
forward(x: Tensor)

Apply the transposed convolution to the input tensor.

Parameters

x: Tensor
Input tensor of shape $(N, C_{\text{in}}, *)$, where $*$ stands for the three spatial dimensions $(D, H, W)$.

Returns

Tensor

Output tensor of shape $(N, C_{\text{out}}, *)$, with spatial dimensions determined by stride, padding, output_padding, dilation, and kernel size.

fn

extra_repr

str
extra_repr()

Return a string representation of the layer's configuration.