fn

conv_transpose3d

conv_transpose3d(x: Tensor, weight: Tensor, bias: Tensor | None = None, stride: int | tuple[int, int, int] = 1, padding: int | tuple[int, int, int] = 0, output_padding: int | tuple[int, int, int] = 0, groups: int = 1, dilation: int | tuple[int, int, int] = 1) -> Tensor

Transposed 3-D convolution — volumetric upsampling.

Extends conv_transpose2d to three spatial dimensions. Standard in 3-D segmentation decoders (V-Net, 3D-UNet), volumetric GANs, and video generation. Like the 1-D / 2-D variants, this is the matrix-transpose of an ordinary 3-D convolution rather than a true mathematical inverse.
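The "matrix-transpose" relationship can be checked on a small 1-D case. The sketch below is plain Python and does not use this library; the helper names (`conv1d_matrix`, `matvec_transposed`, `conv_transpose1d_scatter`) are hypothetical and chosen only for illustration. It assumes no padding, no dilation, a single channel, and cross-correlation semantics for the forward convolution.

```python
def conv1d_matrix(kernel, n_in, stride):
    """Dense matrix C of a 'valid' strided 1-D convolution, so that y = C @ x."""
    k = len(kernel)
    n_out = (n_in - k) // stride + 1
    rows = [[0.0] * n_in for _ in range(n_out)]
    for o in range(n_out):
        for t in range(k):
            rows[o][o * stride + t] = kernel[t]
    return rows


def matvec_transposed(rows, x):
    """Compute C^T @ x without materialising the transpose."""
    n_cols = len(rows[0])
    return [sum(rows[o][j] * x[o] for o in range(len(rows))) for j in range(n_cols)]


def conv_transpose1d_scatter(x, kernel, stride):
    """Transposed convolution as scatter-add: each input stamps a scaled kernel."""
    k = len(kernel)
    out = [0.0] * ((len(x) - 1) * stride + k)
    for i, xi in enumerate(x):
        for t in range(k):
            out[i * stride + t] += xi * kernel[t]
    return out


kernel = [1.0, 2.0, 3.0]
x = [1.0, 10.0, 100.0]                      # length 3 -> output length (3-1)*2 + 3 = 7
via_scatter = conv_transpose1d_scatter(x, kernel, stride=2)
via_matrix = matvec_transposed(conv1d_matrix(kernel, n_in=7, stride=2), x)
print(via_scatter)   # [1.0, 2.0, 13.0, 20.0, 130.0, 200.0, 300.0]
print(via_scatter == via_matrix)            # True
```

Both paths give the same result, which is what "transpose of the convolution matrix" means here. Applying the forward convolution afterwards does not recover `x`, so the operation is not an inverse.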

Parameters

x: Tensor
Input of shape (N, C_in, D_in, H_in, W_in).
weight: Tensor
Filters of shape (C_in, C_out/groups, kD, kH, kW).
bias: Tensor = None
Per-output-channel bias of shape (C_out,). Materialised as zeros when None.
stride: int or (int, int, int) = 1
Upsampling factor per spatial axis.
padding: int or (int, int, int) = 0
Implicit zero-padding subtracted from each side of the output.
output_padding: int or (int, int, int) = 0
Extra size added to the trailing side only of each output spatial axis. It selects among the several output sizes that are otherwise possible when stride > 1.
groups: int = 1
Number of channel groups. C_in must be divisible by groups, and each group produces C_out/groups output channels.
dilation: int or (int, int, int) = 1
Spacing between kernel taps.

Returns

Tensor

Output of shape (N, C_out, D_out, H_out, W_out) where each spatial size obeys

D_{\text{out}} = (D_{\text{in}} - 1) \cdot s_D - 2 p_D + d_D (k_D - 1) + \text{op}_D + 1

(analogously for H and W).
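The output-size formula can be evaluated without allocating any tensors. The following is a minimal sketch in plain Python; `conv_transpose3d_output_shape` and `_triple` are hypothetical helpers, not part of the library, and the sketch assumes the weight layout (C_in, C_out/groups, kD, kH, kW) documented above.

```python
def _triple(v):
    """Expand an int to a 3-tuple; pass through a 3-element sequence."""
    return (v, v, v) if isinstance(v, int) else tuple(v)


def conv_transpose3d_output_shape(x_shape, weight_shape, stride=1, padding=0,
                                  output_padding=0, groups=1, dilation=1):
    """Predict (N, C_out, D_out, H_out, W_out) from the documented formula."""
    n, c_in, *spatial_in = x_shape
    w_c_in, c_out_per_group, *kernel = weight_shape
    if w_c_in != c_in:
        raise ValueError("weight.shape[0] must equal the input channel count")
    s, p, op, d = (_triple(v) for v in (stride, padding, output_padding, dilation))
    spatial_out = tuple(
        (size - 1) * s[a] - 2 * p[a] + d[a] * (kernel[a] - 1) + op[a] + 1
        for a, size in enumerate(spatial_in)
    )
    return (n, c_out_per_group * groups, *spatial_out)


# Kernel 2, stride 2: every spatial axis doubles exactly.
print(conv_transpose3d_output_shape((1, 8, 4, 8, 8), (8, 4, 2, 2, 2), stride=2))
# (1, 4, 8, 16, 16)

# Kernel 3, stride 2 needs padding=1 and output_padding=1 to double exactly.
print(conv_transpose3d_output_shape((1, 8, 4, 8, 8), (8, 4, 3, 3, 3),
                                    stride=2, padding=1, output_padding=1))
# (1, 4, 8, 16, 16)
```

The second call shows a common use of output_padding: without it, the same configuration would give 7, 15, 15 on the spatial axes.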

Notes

Memory cost scales cubically in the spatial dimensions: a uniform stride s multiplies the output volume by roughly s^3. For large volumes, a typical pattern is to interleave trilinear upsampling with conv3d instead. Checkerboard artifacts also extend into three dimensions when the kernel size is not divisible by the stride.
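The checkerboard condition can be seen by counting how many kernel taps land on each output position along one axis. This is an illustrative sketch in plain Python with no padding or dilation; `overlap_counts` is a hypothetical helper, not a library function. In 3-D the per-axis counts multiply, so unevenness on any axis shows up in the volume.

```python
def overlap_counts(n_in, kernel, stride):
    """Number of kernel taps contributing to each output position on one axis."""
    counts = [0] * ((n_in - 1) * stride + kernel)
    for i in range(n_in):              # each input position...
        for t in range(kernel):        # ...stamps `kernel` consecutive outputs
            counts[i * stride + t] += 1
    return counts


# Kernel 3, stride 2: interior coverage alternates between 1 and 2 taps.
print(overlap_counts(4, kernel=3, stride=2))   # [1, 1, 2, 1, 2, 1, 2, 1, 1]

# Kernel 4, stride 2: interior coverage is uniform; only the borders differ.
print(overlap_counts(4, kernel=4, stride=2))   # [1, 1, 2, 2, 2, 2, 2, 2, 1, 1]
```

Uneven coverage means some output voxels receive systematically more contributions than their neighbours, which is the source of the periodic pattern.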

Examples

>>> import lucid
>>> from lucid.nn.functional import conv_transpose3d
>>> x = lucid.randn(1, 8, 4, 8, 8)
>>> w = lucid.randn(8, 4, 2, 2, 2)
>>> y = conv_transpose3d(x, w, stride=2)
>>> y.shape
(1, 4, 8, 16, 16)