
max_pool2d

max_pool2d(x: Tensor, kernel_size: int | tuple[int, int], stride: int | tuple[int, int] | None = None, padding: int | tuple[int, int] = 0, dilation: int | tuple[int, int] = 1, return_indices: bool = False, ceil_mode: bool = False) -> Tensor

2-D max pooling over a sliding window.

Reduces each spatial neighbourhood to its maximum value. This is the canonical downsampling primitive in classical image CNNs: it gives some invariance to small local shifts and reduces both the spatial extent and the compute required by subsequent layers.

Parameters

x : Tensor
Input of shape (N, C, H, W).
kernel_size : int or (int, int)
Size of the pooling window.
stride : int or (int, int), default None
Window step. Defaults to kernel_size (non-overlapping).
padding : int or (int, int), default 0
Implicit zero-padding on each spatial side.
dilation : int or (int, int), default 1
Spacing between window elements. Default 1.
return_indices : bool, default False
Must currently be False; see max_pool1d.
ceil_mode : bool, default False
Use ceil instead of floor in the output-size formula.

Returns

Tensor

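Output of shape (N, C, H_out, W_out), where each spatial dimension follows the formula below (shown for H; W is analogous, with ceil replacing floor when ceil_mode is True):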

$$H_{\text{out}} = \left\lfloor \frac{H + 2 p_H - d_H (k_H - 1) - 1}{s_H} + 1 \right\rfloor$$
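The output-size formula can be checked by hand with a few lines of plain Python. The sketch below is not part of lucid: pool_out_size is a hypothetical helper that evaluates the formula for a single spatial dimension, and it assumes that ceil_mode does nothing more than swap the floor for a ceiling, as the parameter description states. A real implementation may apply further boundary rules.

```python
def pool_out_size(size, kernel, stride=None, padding=0, dilation=1, ceil_mode=False):
    """Hypothetical helper: output length along one spatial dimension."""
    if stride is None:
        stride = kernel  # documented default: non-overlapping windows
    # Numerator of the formula: H + 2p - d(k - 1) - 1
    span = size + 2 * padding - dilation * (kernel - 1) - 1
    if ceil_mode:
        # Integer ceiling division (assumption: ceil simply replaces floor)
        return -(-span // stride) + 1
    return span // stride + 1


print(pool_out_size(32, kernel=2, stride=2))                 # 16
print(pool_out_size(32, kernel=3, stride=2, padding=1))      # 16
print(pool_out_size(7, kernel=2, stride=2))                  # 3
print(pool_out_size(7, kernel=2, stride=2, ceil_mode=True))  # 4
```

The first call reproduces the 32 to 16 reduction shown in the Examples section. The last two calls show the only case where ceil_mode matters: an input whose extent is not covered exactly by whole strides.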

Notes

Math:

$$y_{i,c,h,w} = \max_{m,n} \; x_{i,\,c,\,s_H h + m,\,s_W w + n}$$
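To make the indexing concrete, here is a minimal pure-Python sketch of the expression above applied to a single (H, W) plane stored as nested lists. It is an illustration, not lucid's implementation: max_pool2d_ref is a hypothetical name, and the sketch assumes no padding, no dilation, and floor mode, matching the simplified form of the formula.

```python
def max_pool2d_ref(x, kernel, stride=None):
    """Hypothetical reference: max pool one (H, W) plane given as nested lists."""
    k_h, k_w = kernel
    s_h, s_w = stride if stride is not None else kernel
    h_in, w_in = len(x), len(x[0])
    # Output size with padding 0 and dilation 1: floor((H - k) / s) + 1
    h_out = (h_in - k_h) // s_h + 1
    w_out = (w_in - k_w) // s_w + 1
    return [
        [
            # y[h][w] = max over (m, n) of x[s_h * h + m][s_w * w + n]
            max(x[s_h * h + m][s_w * w + n] for m in range(k_h) for n in range(k_w))
            for w in range(w_out)
        ]
        for h in range(h_out)
    ]


plane = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [3, 6, 1, 1],
]
pooled = max_pool2d_ref(plane, kernel=(2, 2))
print(pooled)  # [[4, 5], [6, 7]]
```

Each output entry is the maximum of one 2x2 block of the input. Passing stride=(1, 1) instead makes the windows overlap and yields a 3x3 result.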

The max is sub-differentiable: the gradient flows only to the argmax position of each window, and every other input in that window receives zero from it. Empirically a strong pooling operator for classification tasks, since it preserves high-magnitude features.
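The gradient routing described above can be sketched in plain Python for a single (H, W) plane. This is an illustrative reference, not lucid's backward pass: max_pool2d_backward_ref is a hypothetical name, padding and dilation are omitted, and ties are broken in favour of the first maximum in row-major order, which is an assumption.

```python
def max_pool2d_backward_ref(x, grad_out, kernel, stride):
    """Hypothetical reference: route each output gradient to its window's argmax."""
    k_h, k_w = kernel
    s_h, s_w = stride
    grad_in = [[0.0] * len(x[0]) for _ in x]
    for h, row in enumerate(grad_out):
        for w, g in enumerate(row):
            # Input coordinates covered by the window of output (h, w)
            window = [
                (s_h * h + m, s_w * w + n) for m in range(k_h) for n in range(k_w)
            ]
            # max() returns the first maximal element, so ties go to the
            # earliest position in row-major order (assumption)
            r, c = max(window, key=lambda pos: x[pos[0]][pos[1]])
            # Accumulate, because overlapping windows can share an argmax
            grad_in[r][c] += g
    return grad_in


plane = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [3, 6, 1, 1],
]
upstream = [[1.0, 1.0], [1.0, 1.0]]
grad = max_pool2d_backward_ref(plane, upstream, kernel=(2, 2), stride=(2, 2))
for row in grad:
    print(row)
# [0.0, 0.0, 0.0, 0.0]
# [1.0, 0.0, 0.0, 1.0]
# [0.0, 0.0, 1.0, 0.0]
# [0.0, 1.0, 0.0, 0.0]
```

Only the four positions holding the window maxima (4, 5, 6 and 7) receive a gradient; all other inputs get zero.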

Examples

>>> import lucid
>>> from lucid.nn.functional import max_pool2d
>>> x = lucid.randn(1, 16, 32, 32)
>>> y = max_pool2d(x, kernel_size=2, stride=2)
>>> y.shape
(1, 16, 16, 16)