class Fold extends Module

Fold(output_size: _Size2d, kernel_size: _Size2d, dilation: _Size2d = 1, padding: _Size2d = 0, stride: _Size2d = 1)


Combine an array of sliding local blocks back into a batched tensor.

Fold is the inverse of Unfold: it reconstructs a spatial tensor from its column representation produced by the im2col (Unfold) operation. This is also known as col2im.

When multiple blocks overlap a single output position, their contributions are summed (accumulated), not averaged. If you need the average instead, fold an all-ones tensor with the same parameters to obtain the per-position overlap count, then divide element-wise.

$$\text{input:} \quad (N,\, C \cdot k_H \cdot k_W,\, L) \;\longrightarrow\; \text{output:} \quad (N,\, C,\, H_{\text{out}},\, W_{\text{out}})$$

where $H_{\text{out}}$ and $W_{\text{out}}$ are given by output_size and the relationship

$$L = \left\lfloor \frac{H_{\text{out}} + 2p_H - d_H(k_H-1) - 1}{s_H} + 1 \right\rfloor \times \left\lfloor \frac{W_{\text{out}} + 2p_W - d_W(k_W-1) - 1}{s_W} + 1 \right\rfloor$$

must hold consistently with the Unfold parameters used.
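As a quick check of the relationship above, the block count can be computed with integer arithmetic. This is a minimal sketch in plain Python; `num_blocks` is a hypothetical helper for illustration, not part of lucid, and it assumes all arguments are already 2-tuples.

```python
def num_blocks(output_size, kernel_size, dilation=(1, 1), padding=(0, 0), stride=(1, 1)):
    """Number of sliding blocks L implied by the Fold geometry (hypothetical helper)."""
    total = 1
    # One factor per spatial dimension (height, then width).
    for size, k, d, p, s in zip(output_size, kernel_size, dilation, padding, stride):
        total *= (size + 2 * p - d * (k - 1) - 1) // s + 1
    return total


print(num_blocks((16, 16), (4, 4), stride=(4, 4)))  # 16
print(num_blocks((8, 8), (3, 3), padding=(1, 1)))   # 64
```

The two calls mirror the geometries used in the Examples section: 4×4 non-overlapping patches on a 16×16 image give 16 blocks, and a padded 3×3 window on an 8×8 image gives 64.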

Parameters

output_size: int or tuple[int, int]

Desired spatial shape $(H_{\text{out}}, W_{\text{out}})$ of the output tensor (excluding batch and channel dimensions). A single int is broadcast to both dimensions.

kernel_size: int or tuple[int, int]

Size of the sliding window $(k_H, k_W)$ (must match the Unfold that produced the input).

dilation: int or tuple[int, int] = 1

Dilation of the kernel $(d_H, d_W)$ (default 1).

padding: int or tuple[int, int] = 0

Zero-padding that was applied during Unfold $(p_H, p_W)$ (default 0).

stride: int or tuple[int, int] = 1

Stride used during Unfold $(s_H, s_W)$ (default 1).

Attributes

output_size: tuple[int, int]

Normalised $(H_{\text{out}}, W_{\text{out}})$ stored as a 2-tuple.

kernel_size: tuple[int, int]

Normalised kernel size stored as a 2-tuple.

dilation: tuple[int, int]

Normalised dilation stored as a 2-tuple.

padding: tuple[int, int]

Normalised padding stored as a 2-tuple.

stride: tuple[int, int]

Normalised stride stored as a 2-tuple.
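The int-to-pair normalisation behind these attributes might look like the sketch below. `to_pair` is a hypothetical name chosen for illustration; how lucid actually performs the normalisation internally is not shown in this page.

```python
def to_pair(value):
    """Broadcast an int to (value, value); pass a 2-element sequence through as a tuple."""
    if isinstance(value, int):
        return (value, value)
    pair = tuple(value)
    if len(pair) != 2:
        raise ValueError(f"expected an int or a pair of ints, got {value!r}")
    return pair


print(to_pair(3))       # (3, 3)
print(to_pair([4, 2]))  # (4, 2)
```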

Notes

Input: $(N, C \cdot k_H \cdot k_W, L)$.
Output: $(N, C, H_{\text{out}}, W_{\text{out}})$.

Overlapping patches accumulate (sum) their contributions; this is the correct adjoint of Unfold for gradient computation.
To reconstruct the average value at each position, fold a tensor of ones with the same parameters and divide element-wise.
Backed by the C++ fold (col2im) op via nn.functional.fold.
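To make the accumulation behaviour concrete, here is a plain-Python reference of col2im for a single sample. It is a sketch only, not the C++ op: `fold2d` is a hypothetical name, and it assumes the usual im2col layout (rows ordered by channel, then kernel row, then kernel column; blocks ordered row-major), which this page does not state explicitly.

```python
def fold2d(cols, channels, output_size, kernel_size,
           dilation=(1, 1), padding=(0, 0), stride=(1, 1)):
    """Reference col2im for one sample (hypothetical helper).

    cols: nested list of shape (channels * kH * kW, L).
    Returns a nested list of shape (channels, H_out, W_out).
    """
    (H, W), (kH, kW) = output_size, kernel_size
    (dH, dW), (pH, pW), (sH, sW) = dilation, padding, stride
    n_h = (H + 2 * pH - dH * (kH - 1) - 1) // sH + 1  # blocks along height
    n_w = (W + 2 * pW - dW * (kW - 1) - 1) // sW + 1  # blocks along width
    out = [[[0.0] * W for _ in range(H)] for _ in range(channels)]
    for c in range(channels):
        for i in range(kH):
            for j in range(kW):
                row = cols[(c * kH + i) * kW + j]  # assumed row layout
                for block in range(n_h * n_w):
                    bh, bw = divmod(block, n_w)
                    y = bh * sH - pH + i * dH
                    x = bw * sW - pW + j * dW
                    # Positions that fall in the padding are discarded.
                    if 0 <= y < H and 0 <= x < W:
                        out[c][y][x] += row[block]  # overlaps are summed
    return out


# Folding all-ones columns yields the per-position overlap count.
ones = [[1.0] * 9 for _ in range(4)]  # shape (1 * 2 * 2, 9)
counts = fold2d(ones, 1, (4, 4), (2, 2))
for line in counts[0]:
    print(line)
```

For a 2×2 window with stride 1 on a 4×4 output, corners are covered once, edges twice, and interior positions four times. Dividing a folded tensor by these counts element-wise gives the average described above.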

Examples

**Round-trip Unfold → Fold (non-overlapping patches):**
>>> import lucid
>>> import lucid.nn as nn
>>>
>>> H, W, kH, kW = 16, 16, 4, 4
>>> unfold = nn.Unfold(kernel_size=(kH, kW), stride=(kH, kW))
>>> fold   = nn.Fold(output_size=(H, W), kernel_size=(kH, kW), stride=(kH, kW))
>>>
>>> x       = lucid.randn(1, 3, H, W)
>>> patches = unfold(x)          # (1, 48, 16)
>>> x_hat   = fold(patches)      # (1, 3, 16, 16) — exact reconstruction
>>>                               # (no overlap → no accumulation artefacts)
**Attention-weighted patch reconstruction (overlapping):**
>>> import lucid
>>> import lucid.nn as nn
>>>
>>> unfold = nn.Unfold(kernel_size=3, padding=1)
>>> fold   = nn.Fold(output_size=(8, 8), kernel_size=3, padding=1)
>>> divisor_fold = nn.Fold(output_size=(8, 8), kernel_size=3, padding=1)
>>>
>>> x    = lucid.randn(1, 2, 8, 8)
>>> cols = unfold(x)                        # (1, 18, 64)
>>> # ... apply per-patch attention weights ...
>>> ones = lucid.ones_like(cols)
>>> out  = fold(cols) / divisor_fold(ones)  # average over overlaps


Constructors

__init__(output_size: _Size2d, kernel_size: _Size2d, dilation: _Size2d = 1, padding: _Size2d = 0, stride: _Size2d = 1) → None


Initialise the Fold module. See the class docstring for parameter semantics.

Instance methods

extra_repr() → str


Return a string representation of the layer's configuration.

forward(x: Tensor) → Tensor


Combine the sliding local blocks in x back into a batched tensor, summing contributions where blocks overlap.

Parameters

x: Tensor

Input tensor of shape $(N, C \cdot k_H \cdot k_W, L)$.

Returns

Tensor

Output tensor of shape $(N, C, H_{\text{out}}, W_{\text{out}})$.
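The shape contract of Fold's forward pass, as documented for the class, can be sketched as a small validation function. `fold_output_shape` is a hypothetical helper, and whether lucid raises on a mismatch (or with what message) is an assumption made here for illustration.

```python
def fold_output_shape(input_shape, output_size, kernel_size,
                      dilation=(1, 1), padding=(0, 0), stride=(1, 1)):
    """Infer (N, C, H_out, W_out) from an (N, C*kH*kW, L) input shape (hypothetical helper)."""
    n, ckk, length = input_shape
    kH, kW = kernel_size
    if ckk % (kH * kW) != 0:
        raise ValueError("second dimension must be a multiple of kH * kW")
    # L must equal the number of blocks implied by the geometry.
    expected = 1
    for size, k, d, p, s in zip(output_size, kernel_size, dilation, padding, stride):
        expected *= (size + 2 * p - d * (k - 1) - 1) // s + 1
    if length != expected:
        raise ValueError(f"expected L = {expected}, got {length}")
    return (n, ckk // (kH * kW), output_size[0], output_size[1])


print(fold_output_shape((1, 48, 16), (16, 16), (4, 4), stride=(4, 4)))  # (1, 3, 16, 16)
print(fold_output_shape((1, 18, 64), (8, 8), (3, 3), padding=(1, 1)))   # (1, 2, 8, 8)
```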
