class ConvTranspose2d

extends Module
ConvTranspose2d(in_channels: int, out_channels: int, kernel_size: _Size2d, stride: _Size2d = 1, padding: _Size2d = 0, output_padding: _Size2d = 0, groups: int = 1, bias: bool = True, dilation: _Size2d = 1, device: DeviceLike = None, dtype: DTypeLike = None)

Applies a 2D transposed convolution over an input feature map.

Also known as a fractionally-strided convolution, this module is commonly used for spatial upsampling in generative models (VAEs, GANs), dense-prediction decoders (U-Net), and super-resolution networks. It is the transpose (adjoint) of the linear map computed by Conv2d, not its inverse.

The output spatial dimensions satisfy:

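$$H_{\text{out}} = (H_{\text{in}} - 1) \cdot s_h - 2p_h + d_h(K_H - 1) + p^{\text{out}}_h + 1$$
$$W_{\text{out}} = (W_{\text{in}} - 1) \cdot s_w - 2p_w + d_w(K_W - 1) + p^{\text{out}}_w + 1$$

Here $s$, $p$, $d$, and $p^{\text{out}}$ denote stride, padding, dilation, and output_padding along the height ($h$) or width ($w$) axis, and $K_H \times K_W$ is the kernel size.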
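The formulas above can be checked numerically with a small helper. The sketch below is plain Python and does not call lucid; the names `conv_transpose2d_output_size` and `_pair` are hypothetical, introduced only for illustration.

```python
def _pair(value):
    """Expand an int to (value, value); pass 2-tuples through (hypothetical helper)."""
    return (value, value) if isinstance(value, int) else tuple(value)


def conv_transpose2d_output_size(in_size, kernel_size, stride=1, padding=0,
                                 output_padding=0, dilation=1):
    """Evaluate the H_out / W_out formulas for an (H_in, W_in) input size."""
    per_axis = zip(_pair(in_size), _pair(kernel_size), _pair(stride),
                   _pair(padding), _pair(output_padding), _pair(dilation))
    # Per axis: (in - 1) * s - 2p + d * (K - 1) + p_out + 1
    return tuple((n - 1) * s - 2 * p + d * (k - 1) + op + 1
                 for n, k, s, p, op, d in per_axis)


# Shapes used in the Examples section: 4x4 -> 8x8 and 16x16 -> 32x32.
vae_size = conv_transpose2d_output_size((4, 4), kernel_size=4, stride=2, padding=1)
unet_size = conv_transpose2d_output_size((16, 16), kernel_size=2, stride=2)
```

Both results agree with the shapes reported in the Examples section, which makes the helper a convenient way to plan decoder shapes before building a model.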

Parameters

in_channels: int
Number of channels in the input feature map.
out_channels: int
Number of channels produced by the transposed convolution.
kernel_size: int or tuple[int, int]
Size of the convolving kernel.
stride: int or tuple[int, int] = 1
Stride of the transposed convolution. Values > 1 upsample the spatial dimensions. Default: 1.
padding: int or tuple[int, int] = 0
Controls the implicit zero-padding: dilation * (kernel_size - 1) - padding zeros are added to both sides of each spatial dimension, so larger values shrink the output. Default: 0.
output_padding: int or tuple[int, int] = 0
Additional size added to one side of each spatial dimension of the output. Must satisfy 0 <= output_padding < max(stride, dilation) along each axis. Default: 0.
groups: int = 1
Number of blocked connections from input channels to output channels. Default: 1.
bias: bool = True
If True, adds a learnable bias to the output. Default: True.
dilation: int or tuple[int, int] = 1
Spacing between kernel elements. Default: 1.
device: DeviceLike = None
Device on which to allocate parameters. Default: None.
dtype: DTypeLike = None
Data type for the parameters. Default: None.

Attributes

weight: Parameter
Learnable kernel of shape (in_channels, out_channels // groups, K_H, K_W). The leading axis is in_channels, the reverse of the Conv2d layout. Initialized with Kaiming uniform ($a = \sqrt{5}$).
bias: Parameter or None
Learnable bias of shape (out_channels,), or None if bias=False.
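To make the shapes above concrete, the following sketch computes the weight and bias shapes and the resulting parameter count from the constructor arguments. It is plain Python and does not inspect a real lucid module; the function names `conv_transpose2d_param_shapes` and `num_parameters` are hypothetical.

```python
def conv_transpose2d_param_shapes(in_channels, out_channels, kernel_size,
                                  groups=1, bias=True):
    """Return (weight_shape, bias_shape) as documented in the Attributes list."""
    k_h, k_w = kernel_size
    # The leading axis is in_channels; the second axis is split across groups.
    weight_shape = (in_channels, out_channels // groups, k_h, k_w)
    bias_shape = (out_channels,) if bias else None
    return weight_shape, bias_shape


def num_parameters(weight_shape, bias_shape):
    """Count the scalar parameters implied by the two shapes."""
    count = 1
    for dim in weight_shape:
        count *= dim
    if bias_shape is not None:
        count += bias_shape[0]
    return count


# The VAE decoder layer from the Examples section: 128 -> 64 channels, 4x4 kernel.
weight_shape, bias_shape = conv_transpose2d_param_shapes(128, 64, (4, 4))
total = num_parameters(weight_shape, bias_shape)

# With groups=2 the second weight axis halves.
grouped_weight_shape, _ = conv_transpose2d_param_shapes(128, 64, (4, 4), groups=2)
```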

Notes

Input: $(N, C_{\text{in}}, H_{\text{in}}, W_{\text{in}})$
Output: $(N, C_{\text{out}}, H_{\text{out}}, W_{\text{out}})$, with $H_{\text{out}}$ and $W_{\text{out}}$ as given by the formulas above.

Checkerboard artefacts. Transposed convolutions with stride > 1 can produce characteristic checkerboard patterns in the output when the kernel size is not divisible by the stride, because neighbouring output positions then receive contributions from different numbers of kernel taps. A common mitigation is to use kernel_size = stride * n for some integer n, or to replace the transposed convolution with bilinear upsampling followed by a regular convolution.
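The uneven overlap behind checkerboard patterns can be seen by counting, in one dimension, how many kernel taps land on each output position. This is a minimal sketch assuming no padding and dilation 1; `overlap_counts` is a hypothetical helper, not part of lucid.

```python
def overlap_counts(in_length, kernel_size, stride):
    """Count kernel taps contributing to each output position (1-D, no padding)."""
    out_length = (in_length - 1) * stride + kernel_size
    counts = [0] * out_length
    for i in range(in_length):
        for k in range(kernel_size):
            # Input element i scatters the kernel starting at offset i * stride.
            counts[i * stride + k] += 1
    return counts


# kernel_size=3 is not divisible by stride=2: interior counts alternate 2, 1, 2, 1.
uneven = overlap_counts(4, kernel_size=3, stride=2)
# -> [1, 1, 2, 1, 2, 1, 2, 1, 1]

# kernel_size=4 is divisible by stride=2: interior counts are uniform.
even = overlap_counts(4, kernel_size=4, stride=2)
# -> [1, 1, 2, 2, 2, 2, 2, 2, 1, 1]
```

Uniform interior counts do not guarantee an artefact-free output, since the learned weights can still be unbalanced, but they remove the structural source of the pattern.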

output_padding. When stride > 1, the output-size mapping of the corresponding Conv2d is not injective: several input sizes map to the same output size, so the transposed convolution alone cannot tell which size to reconstruct. output_padding resolves this ambiguity and must be set consistently with the encoder stride and input size to recover the exact spatial dimensions.
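The ambiguity described above can be illustrated with size arithmetic alone. The sketch assumes the standard strided-convolution output-size formula for the encoder side (an assumption about Conv2d, which this page does not state) and dilation 1; both function names are hypothetical.

```python
def conv2d_out(size, kernel_size, stride, padding):
    """Output size of a strided convolution along one axis (assumed standard formula)."""
    return (size + 2 * padding - (kernel_size - 1) - 1) // stride + 1


def conv_transpose2d_out(size, kernel_size, stride, padding, output_padding=0):
    """Output size of the transposed convolution along one axis (dilation 1)."""
    return (size - 1) * stride - 2 * padding + (kernel_size - 1) + output_padding + 1


# Encoder: inputs of size 7 and 8 both shrink to 4 with kernel 3, stride 2, padding 1.
encoded_from_7 = conv2d_out(7, kernel_size=3, stride=2, padding=1)  # 4
encoded_from_8 = conv2d_out(8, kernel_size=3, stride=2, padding=1)  # 4

# Decoder: output_padding picks which of the two sizes is reconstructed.
decoded_7 = conv_transpose2d_out(4, kernel_size=3, stride=2, padding=1, output_padding=0)  # 7
decoded_8 = conv_transpose2d_out(4, kernel_size=3, stride=2, padding=1, output_padding=1)  # 8
```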

Examples

VAE decoder: upsample 4×4 latent to 8×8:
>>> import lucid
>>> import lucid.nn as nn
>>> decoder = nn.ConvTranspose2d(
...     in_channels=128, out_channels=64,
...     kernel_size=4, stride=2, padding=1
... )
>>> z = lucid.zeros(8, 128, 4, 4)
>>> y = decoder(z)
>>> y.shape
(8, 64, 8, 8)

U-Net upsampling block:
>>> import lucid
>>> import lucid.nn as nn
>>> up = nn.ConvTranspose2d(256, 128, kernel_size=2, stride=2)
>>> x = lucid.zeros(4, 256, 16, 16)
>>> y = up(x)
>>> y.shape
(4, 128, 32, 32)

Methods (3)

__init__

__init__(in_channels: int, out_channels: int, kernel_size: _Size2d, stride: _Size2d = 1, padding: _Size2d = 0, output_padding: _Size2d = 0, groups: int = 1, bias: bool = True, dilation: _Size2d = 1, device: DeviceLike = None, dtype: DTypeLike = None) -> None

Initialise the ConvTranspose2d module. See the class docstring for parameter semantics.

forward

forward(x: Tensor) -> Tensor

Apply the transposed convolution to the input tensor.

Parameters

x: Tensor
Input tensor of shape $(N, C_{\text{in}}, H_{\text{in}}, W_{\text{in}})$.

Returns

Tensor

Output tensor of shape $(N, C_{\text{out}}, H_{\text{out}}, W_{\text{out}})$, with spatial dimensions determined by stride, padding, output_padding, dilation, and kernel size.
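For intuition about what the forward pass computes, the following is a naive single-channel reference written with nested Python lists. It is an illustrative sketch of the scatter-add view of transposed convolution, not lucid's implementation; the function name `conv_transpose2d_single` is hypothetical, and batching, channels, groups, and bias are omitted.

```python
def conv_transpose2d_single(x, w, stride=1, padding=0, output_padding=0, dilation=1):
    """Naive single-channel 2D transposed convolution on nested lists."""
    h_in, w_in = len(x), len(x[0])
    k_h, k_w = len(w), len(w[0])
    # Output size from the class-level formula (same setting on both axes).
    h_out = (h_in - 1) * stride - 2 * padding + dilation * (k_h - 1) + output_padding + 1
    w_out = (w_in - 1) * stride - 2 * padding + dilation * (k_w - 1) + output_padding + 1
    out = [[0.0] * w_out for _ in range(h_out)]
    for i in range(h_in):
        for j in range(w_in):
            for a in range(k_h):
                for b in range(k_w):
                    # Each input element scatters a scaled copy of the kernel;
                    # padding shifts the target index and crops the border.
                    r = i * stride + a * dilation - padding
                    c = j * stride + b * dilation - padding
                    if 0 <= r < h_out and 0 <= c < w_out:
                        out[r][c] += x[i][j] * w[a][b]
    return out


# A 2x2 all-ones kernel with stride 2 copies each input value into a 2x2 block.
upsampled = conv_transpose2d_single([[1, 2], [3, 4]], [[1, 1], [1, 1]], stride=2)
# -> [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]] (as floats)
```

The scatter-add form makes the relationship to Conv2d visible: where a convolution gathers a kernel-sized window into one output element, the transposed operation spreads one input element over a kernel-sized window.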

extra_repr

extra_repr() -> str

Return a string representation of the layer's configuration.