class MaxPool3d (extends Module)
MaxPool3d(kernel_size: _Size3d, stride: _Size3d | None = None, padding: _Size3d = 0, dilation: _Size3d = 1, return_indices: bool = False, ceil_mode: bool = False)

Applies 3-D max pooling over a volumetric feature map.

Extends MaxPool2d to three spatial dimensions (depth, height, width). For each output position the maximum is taken over a $k_D \times k_H \times k_W$ window:

$$y[n,c,d,h,w] = \max_{\substack{0 \le k_d < k_D \\ 0 \le k_h < k_H \\ 0 \le k_w < k_W}} x\!\left[n,\, c,\, d \cdot s_D + k_d,\, h \cdot s_H + k_h,\, w \cdot s_W + k_w \right]$$
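The window formula can be checked against a naive reference loop. The sketch below is illustrative only and is not how the library implements pooling: it works on a single (D, H, W) volume stored as nested Python lists, assumes no padding and no dilation, and the name `max_pool3d_reference` is hypothetical.

```python
from itertools import product


def max_pool3d_reference(x, kernel, stride):
    """Naive max pooling over one (D, H, W) volume given as nested lists.

    Illustrative only: no padding, no dilation, floor output size.
    """
    k_d, k_h, k_w = kernel
    s_d, s_h, s_w = stride
    d_in, h_in, w_in = len(x), len(x[0]), len(x[0][0])

    # Number of window positions that fit entirely inside the input.
    d_out = (d_in - k_d) // s_d + 1
    h_out = (h_in - k_h) // s_h + 1
    w_out = (w_in - k_w) // s_w + 1

    out = [[[None] * w_out for _ in range(h_out)] for _ in range(d_out)]
    for d, h, w in product(range(d_out), range(h_out), range(w_out)):
        # Maximum over the k_D x k_H x k_W window anchored at (d*s_D, h*s_H, w*s_W).
        out[d][h][w] = max(
            x[d * s_d + kd][h * s_h + kh][w * s_w + kw]
            for kd, kh, kw in product(range(k_d), range(k_h), range(k_w))
        )
    return out


# A 2x4x4 volume whose entries encode their own flat index (d*16 + h*4 + w).
volume = [[[d * 16 + h * 4 + w for w in range(4)] for h in range(4)] for d in range(2)]
pooled = max_pool3d_reference(volume, kernel=(2, 2, 2), stride=(2, 2, 2))
print(pooled)  # [[[21, 23], [29, 31]]]
```

Because the values grow with the index, each output is the entry at the far corner of its window, which makes the result easy to verify by hand.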

Output dimensions:

$$\begin{aligned} D_{out} &= \left\lfloor \frac{D_{in} + 2p_D - k_D}{s_D} + 1 \right\rfloor \\[4pt] H_{out} &= \left\lfloor \frac{H_{in} + 2p_H - k_H}{s_H} + 1 \right\rfloor \\[4pt] W_{out} &= \left\lfloor \frac{W_{in} + 2p_W - k_W}{s_W} + 1 \right\rfloor \end{aligned}$$
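As a quick way to sanity-check shapes before building a model, the output-size formula can be evaluated directly. This is a minimal sketch with hypothetical helper names (`pool_output_size`, `pool3d_output_shape`); it covers only the floor formula as written here, so dilation and `ceil_mode` are left out.

```python
def pool_output_size(size_in, kernel, stride, padding=0):
    """Output length along one axis: floor((in + 2p - k) / s + 1)."""
    # Integer floor division gives the same result as flooring the fraction.
    return (size_in + 2 * padding - kernel) // stride + 1


def pool3d_output_shape(input_shape, kernel, stride, padding=(0, 0, 0)):
    """Map (N, C, D_in, H_in, W_in) to (N, C, D_out, H_out, W_out)."""
    n, c, *spatial = input_shape
    pooled = tuple(
        pool_output_size(size, k, s, p)
        for size, k, s, p in zip(spatial, kernel, stride, padding)
    )
    return (n, c) + pooled


shape_a = pool3d_output_shape((1, 32, 16, 16, 16), (2, 2, 2), (2, 2, 2))
shape_b = pool3d_output_shape((2, 64, 8, 28, 28), (1, 3, 3), (1, 2, 2))
print(shape_a)  # (1, 32, 8, 8, 8)
print(shape_b)  # (2, 64, 8, 13, 13)
```

Both results agree with the shapes shown in the Examples section.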

Parameters

kernel_size: int or tuple[int, int, int]
Size of the pooling window along (D, H, W).
stride: int or tuple[int, int, int] or None = None
Step between windows. Defaults to kernel_size.
padding: int or tuple[int, int, int] = 0
Zero-padding applied to all six faces of the input. Default: 0.
dilation: int or tuple[int, int, int] = 1
Spacing between kernel elements. Default: 1.
return_indices: bool = False
Not yet supported. Default: False.
ceil_mode: bool = False
Use ceiling instead of floor when computing the output size. Default: False.
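The parameter conventions above (an int standing for all three axes, `stride` falling back to `kernel_size`, and `ceil_mode` switching the rounding) can be sketched as follows. All helper names are hypothetical and not part of the library. The `ceil_mode` branch simply swaps floor for ceiling, which is an assumption: some frameworks add an extra rule that discards a final window starting entirely in the padding, and that rule is not modelled here.

```python
import math


def as_triple(value):
    """Expand an int to (value, value, value); pass three values through as a tuple."""
    if isinstance(value, int):
        return (value, value, value)
    value = tuple(value)
    if len(value) != 3:
        raise ValueError(f"expected an int or 3 values, got {value!r}")
    return value


def resolve_pool_args(kernel_size, stride=None, padding=0):
    """Return (kernel, stride, padding) as 3-tuples in (D, H, W) order."""
    kernel = as_triple(kernel_size)
    # stride=None falls back to the kernel size, giving non-overlapping windows.
    step = kernel if stride is None else as_triple(stride)
    return kernel, step, as_triple(padding)


def output_size(size_in, kernel, stride, padding=0, ceil_mode=False):
    """Output length along one axis; ceil_mode rounds up instead of down (assumed)."""
    raw = (size_in + 2 * padding - kernel) / stride + 1
    return math.ceil(raw) if ceil_mode else math.floor(raw)


resolved = resolve_pool_args(kernel_size=2)
mixed = resolve_pool_args(kernel_size=(1, 3, 3), stride=(1, 2, 2))
floor_size = output_size(28, kernel=3, stride=2)
ceil_size = output_size(28, kernel=3, stride=2, ceil_mode=True)
print(resolved)               # ((2, 2, 2), (2, 2, 2), (0, 0, 0))
print(floor_size, ceil_size)  # 13 14
```

With a 28-wide axis, kernel 3 and stride 2, the fraction evaluates to 13.5, so the two rounding modes differ by exactly one output position.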

Attributes

kernel_size: int or tuple[int, int, int]
stride: int or tuple[int, int, int] or None
padding: int or tuple[int, int, int]
dilation: int or tuple[int, int, int]
ceil_mode: bool

Notes

  • Input: (N, C, D_in, H_in, W_in)
  • Output: (N, C, D_out, H_out, W_out)
  • Typical use cases include video understanding (D = temporal frames) and 3-D medical imaging (CT / MRI volumes).
  • Computational cost is proportional to the product $D_{out} \times H_{out} \times W_{out} \times k_D k_H k_W$.
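To get a feel for the cost note above, the product can be evaluated for concrete shapes. This is a rough estimate under stated assumptions, not a profile of the library: the helper name `pool3d_window_reads` is hypothetical, and multiplying by N and C assumes every batch element and channel is pooled independently, as the indexing in the window formula suggests.

```python
def pool3d_window_reads(output_shape, kernel):
    """Count input elements visited: N * C * D_out * H_out * W_out * k_D * k_H * k_W."""
    n, c, d_out, h_out, w_out = output_shape
    k_d, k_h, k_w = kernel
    per_channel = d_out * h_out * w_out * k_d * k_h * k_w
    return n * c * per_channel


# Output shapes taken from the two documented examples.
reads_cube = pool3d_window_reads((1, 32, 8, 8, 8), (2, 2, 2))
reads_video = pool3d_window_reads((2, 64, 8, 13, 13), (1, 3, 3))
print(reads_cube)   # 131072
print(reads_video)  # 1557504
```

The video-style configuration does roughly twelve times more work despite its thin temporal kernel, mostly because it keeps all 8 frames and has four times as many batch-channel slices.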

Examples

Halving all three spatial dimensions:
>>> import lucid
>>> import lucid.nn as nn
>>> pool = nn.MaxPool3d(kernel_size=2, stride=2)
>>> x = lucid.ones((1, 32, 16, 16, 16))
>>> y = pool(x)
>>> y.shape
(1, 32, 8, 8, 8)
Asymmetric window for video (different temporal/spatial strides):
>>> pool = nn.MaxPool3d(kernel_size=(1, 3, 3), stride=(1, 2, 2))
>>> x = lucid.ones((2, 64, 8, 28, 28))
>>> y = pool(x)
>>> y.shape
(2, 64, 8, 13, 13)

Methods (3)

__init__(kernel_size: _Size3d, stride: _Size3d | None = None, padding: _Size3d = 0, dilation: _Size3d = 1, return_indices: bool = False, ceil_mode: bool = False) -> None

Initialise the MaxPool3d module. See the class docstring for parameter semantics.

forward(x: Tensor) -> Tensor

Apply the pooling operation to the input tensor.

Parameters

x: Tensor
Input tensor of shape (N, C, *), where * stands for the spatial dimensions expected by the layer; for MaxPool3d this is (N, C, D_in, H_in, W_in).

Returns

Tensor

Pooled output tensor of shape (N, C, D_out, H_out, W_out).

extra_repr() -> str

Return a string representation of the layer's configuration.