class ConvTranspose3d extends Module

ConvTranspose3d(in_channels: int, out_channels: int, kernel_size: _Size3d, stride: _Size3d = 1, padding: _Size3d = 0, output_padding: _Size3d = 0, groups: int = 1, bias: bool = True, dilation: _Size3d = 1, device: DeviceLike = None, dtype: DTypeLike = None)

Implementing kernel: C++ class ConvTransposeNdBackward

Applies a 3D transposed convolution (fractionally-strided convolution).

The 3D extension of ConvTranspose2d. It upsamples all three spatial dimensions simultaneously and is used in volumetric decoders such as 3D autoencoders, video generators, and medical image synthesis.

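Output size along each spatial axis $X \in \{D, H, W\}$:

$$X_{\text{out}} = (X_{\text{in}} - 1) \cdot s_X - 2 p_X + d_X (K_X - 1) + p^{\text{out}}_X + 1$$

where $s_X$, $p_X$, $d_X$, $K_X$, and $p^{\text{out}}_X$ are the stride, padding, dilation, kernel size, and output_padding along axis $X$.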
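
The output-size formula can be checked with a few lines of plain Python. This is a minimal sketch that only mirrors the arithmetic and does not call into lucid; the helper names `_triple` and `conv_transpose3d_out_shape` are hypothetical and not part of the library.

```python
def _triple(v):
    """Expand an int to a 3-tuple; pass a length-3 sequence through."""
    if isinstance(v, int):
        return (v, v, v)
    v = tuple(v)
    if len(v) != 3:
        raise ValueError("expected an int or a sequence of 3 ints")
    return v


def conv_transpose3d_out_shape(in_shape, kernel_size, stride=1, padding=0,
                               output_padding=0, dilation=1):
    """Hypothetical helper: map (D, H, W) to (D_out, H_out, W_out)."""
    k, s, p = _triple(kernel_size), _triple(stride), _triple(padding)
    op, d = _triple(output_padding), _triple(dilation)
    # Apply the per-axis formula independently to D, H, and W.
    return tuple(
        (x - 1) * s[i] - 2 * p[i] + d[i] * (k[i] - 1) + op[i] + 1
        for i, x in enumerate(in_shape)
    )


print(conv_transpose3d_out_shape((4, 8, 8), kernel_size=4, stride=2, padding=1))
# (8, 16, 16)
```

With kernel_size=4, stride=2, padding=1 every axis exactly doubles, which is why that combination is a common choice for 2× upsampling.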

Parameters

in_channels: int

Number of channels in the input volume.

out_channels: int

Number of channels produced by the transposed convolution.

kernel_size: int or tuple[int, int, int]

Size of the 3D convolving kernel.

stride: int or tuple[int, int, int] = 1

Stride. Values > 1 upsample the spatial dimensions. Default: 1.

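padding: int or tuple[int, int, int] = 0

Controls the implicit zero-padding: dilation * (kernel_size - 1) - padding zeros are added to both sides of each spatial axis of the input, so a larger padding value produces a smaller output. Default: 0.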

output_padding: int or tuple[int, int, int] = 0

Additional size added to one side of each spatial dimension of the output. It resolves the ambiguity in output size that arises when stride > 1. Default: 0.

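groups: int = 1

Number of blocked connections from input channels to output channels. in_channels and out_channels should both be divisible by groups. Default: 1.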

bias: bool = True

If True, adds a learnable bias. Default: True.

dilation: int or tuple[int, int, int] = 1

Spacing between kernel elements. Default: 1.

device: DeviceLike = None

Device on which to allocate parameters. Default: None.

dtype: DTypeLike = None

Data type for the parameters. Default: None.

Attributes

weight: Parameter

Learnable kernel of shape (in_channels, out_channels // groups, K_D, K_H, K_W). The leading axis is in_channels (same convention as ConvTranspose2d). Initialized with Kaiming uniform ($a = \sqrt{5}$).

bias: Parameter or None

Learnable bias of shape (out_channels,), or None when bias=False.
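
To make the attribute shapes concrete, the sketch below derives the weight shape and the total number of learnable parameters from the constructor arguments. It is plain Python and does not call into lucid; the helper name `conv_transpose3d_param_count` is hypothetical, and the divisibility check on `groups` is an assumption about how the layer validates its arguments.

```python
from math import prod


def conv_transpose3d_param_count(in_channels, out_channels, kernel_size,
                                 groups=1, bias=True):
    """Hypothetical helper: return (weight_shape, total_parameter_count).

    kernel_size is a (K_D, K_H, K_W) tuple.
    """
    # Assumption: both channel counts must divide evenly into groups.
    if in_channels % groups or out_channels % groups:
        raise ValueError("in_channels and out_channels must be divisible by groups")
    # Leading axis is in_channels, as stated for the weight attribute.
    weight_shape = (in_channels, out_channels // groups, *kernel_size)
    total = prod(weight_shape) + (out_channels if bias else 0)
    return weight_shape, total


print(conv_transpose3d_param_count(64, 32, (4, 4, 4)))
# ((64, 32, 4, 4, 4), 131104)
print(conv_transpose3d_param_count(64, 32, (4, 4, 4), groups=4))
# ((64, 8, 4, 4, 4), 32800)
```

Increasing groups shrinks only the second weight axis, so the weight count drops by a factor of groups while the bias stays at out_channels entries.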

Notes

Input: $(N, C_{\text{in}}, D, H, W)$.

Output: $(N, C_{\text{out}}, D_{\text{out}}, H_{\text{out}}, W_{\text{out}})$, with each spatial size given by the formula above.

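Memory. 3D transposed convolutions produce large feature maps at decoder stages, because the output volume grows with the product of the strides along all three axes (a stride of 2 yields roughly 8× more elements per channel). Gradient checkpointing or smaller out_channels values are often necessary when operating on high-resolution volumes.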
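
A rough size estimate shows how quickly the output activation grows. This is a back-of-the-envelope sketch in plain Python, assuming 4 bytes per element (float32) and counting only a single output tensor, not gradients or intermediate buffers; the helper name `activation_bytes` is hypothetical.

```python
from math import prod


def activation_bytes(shape, bytes_per_element=4):
    """Hypothetical helper: bytes needed to store one dense tensor of this shape."""
    return prod(shape) * bytes_per_element


MiB = 1024 ** 2

# Output of the small upsampling example: (N, C_out, D, H, W).
small = activation_bytes((2, 32, 8, 16, 16))
# Same batch and channel count at a 16x larger resolution per axis.
large = activation_bytes((2, 32, 128, 256, 256))

print(small / MiB)  # 0.5
print(large / MiB)  # 2048.0
```

Scaling each spatial axis by 16 multiplies the footprint by 16³ = 4096, which is why reducing out_channels or recomputing activations during the backward pass becomes attractive at high resolution.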

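Symmetric decoder design. Pair each Conv3d in the encoder with a ConvTranspose3d that uses the same kernel_size, stride, padding, and dilation in the decoder. With stride = 1 this reproduces the encoder's input shape exactly. With stride > 1 several input sizes map to the same encoded size, so output_padding may also have to be set to recover the original shape.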
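
The interaction between stride and output_padding can be illustrated per axis in plain Python. This sketch assumes Conv3d follows the usual floor-based output-size formula, which the text here does not state; the helper names `conv_out`, `conv_transpose_out`, and `required_output_padding` are hypothetical and do not call into lucid.

```python
def conv_out(x, k, s=1, p=0, d=1):
    """Assumed Conv3d size along one axis (standard floor formula)."""
    return (x + 2 * p - d * (k - 1) - 1) // s + 1


def conv_transpose_out(x, k, s=1, p=0, op=0, d=1):
    """ConvTranspose3d size along one axis, per the output-size formula."""
    return (x - 1) * s - 2 * p + d * (k - 1) + op + 1


def required_output_padding(x, k, s=1, p=0, d=1):
    """output_padding needed so the decoder restores an axis of size x."""
    encoded = conv_out(x, k, s, p, d)
    return x - conv_transpose_out(encoded, k, s, p, 0, d)


# Sizes 7 and 8 both encode to 4 with kernel_size=3, stride=2, padding=1.
print(conv_out(7, 3, 2, 1), conv_out(8, 3, 2, 1))   # 4 4
# Without output_padding the decoder always returns 7.
print(conv_transpose_out(4, 3, 2, 1))                # 7
# Restoring 8 needs output_padding=1; restoring 7 needs none.
print(required_output_padding(8, 3, 2, 1))           # 1
print(required_output_padding(7, 3, 2, 1))           # 0
```

Under the floor formula the required value equals the remainder discarded by the encoder, so it always lies between 0 and stride - 1.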

Examples

Volumetric upsampling (2× along all spatial axes):
>>> import lucid
>>> import lucid.nn as nn
>>> up3d = nn.ConvTranspose3d(
...     in_channels=64, out_channels=32,
...     kernel_size=4, stride=2, padding=1
... )
>>> x = lucid.zeros(2, 64, 4, 8, 8)
>>> y = up3d(x)
>>> y.shape
(2, 32, 8, 16, 16)

3D autoencoder decoder block:
>>> import lucid
>>> import lucid.nn as nn
>>> decoder3d = nn.ConvTranspose3d(128, 64, kernel_size=3, stride=1, padding=1)
>>> x = lucid.zeros(1, 128, 8, 8, 8)
>>> y = decoder3d(x)
>>> y.shape
(1, 64, 8, 8, 8)

Constructors

__init__

→None

__init__(in_channels: int, out_channels: int, kernel_size: _Size3d, stride: _Size3d = 1, padding: _Size3d = 0, output_padding: _Size3d = 0, groups: int = 1, bias: bool = True, dilation: _Size3d = 1, device: DeviceLike = None, dtype: DTypeLike = None)

Initialise the ConvTranspose3d module. See the class docstring for parameter semantics.

Instance methods

extra_repr

→str

extra_repr()

Return a string representation of the layer's configuration.

forward

→Tensor

forward(x: Tensor)

Apply the transposed convolution to the input tensor.

Parameters

x: Tensor

Input tensor of shape $(N, C_{\text{in}}, D, H, W)$.

Returns

Tensor

Output tensor of shape $(N, C_{\text{out}}, D_{\text{out}}, H_{\text{out}}, W_{\text{out}})$, with spatial dimensions determined by stride, padding, output_padding, dilation, and kernel size.
