class

MaxPool3d

extends Module

MaxPool3d(kernel_size: _Size3d, stride: _Size3d | None = None, padding: _Size3d = 0, dilation: _Size3d = 1, return_indices: bool = False, ceil_mode: bool = False)

Implementing kernel

C++ class MaxPoolNdBackward

Applies 3-D max pooling over a volumetric feature map.

Extends MaxPool2d to three spatial dimensions (depth, height, width). For each output position the maximum is taken over a $k_D \times k_H \times k_W$ window:

y[n,c,d,h,w] = \max_{\substack{0 \le k_d < k_D \\ 0 \le k_h < k_H \\ 0 \le k_w < k_W}} x\!\left[n,\, c,\, d \cdot s_D + k_d,\, h \cdot s_H + k_h,\, w \cdot s_W + k_w \right]
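The formula above can be read as a set of nested loops. The following is a minimal pure-Python sketch of it for a single (D, H, W) volume, assuming no padding and dilation 1; `max_pool3d_naive` is a hypothetical helper written for illustration and is not part of lucid.

```python
from itertools import product


def max_pool3d_naive(x, kernel, stride):
    """Max-pool one (D, H, W) volume given as nested lists.

    Illustrative only: no padding, dilation 1, one sample and one channel.
    """
    k_d, k_h, k_w = kernel
    s_d, s_h, s_w = stride
    d_in, h_in, w_in = len(x), len(x[0]), len(x[0][0])

    # Number of windows that fit entirely inside the input on each axis.
    d_out = (d_in - k_d) // s_d + 1
    h_out = (h_in - k_h) // s_h + 1
    w_out = (w_in - k_w) // s_w + 1

    out = [[[None] * w_out for _ in range(h_out)] for _ in range(d_out)]
    for d, h, w in product(range(d_out), range(h_out), range(w_out)):
        # Maximum over the k_D x k_H x k_W window anchored at (d*s_D, h*s_H, w*s_W).
        out[d][h][w] = max(
            x[d * s_d + i][h * s_h + j][w * s_w + l]
            for i, j, l in product(range(k_d), range(k_h), range(k_w))
        )
    return out


# A 2x4x4 volume whose entries equal their own flat index (0..31).
volume = [[[d * 16 + h * 4 + w for w in range(4)] for h in range(4)] for d in range(2)]
pooled = max_pool3d_naive(volume, kernel=(2, 2, 2), stride=(2, 2, 2))
print(pooled)  # [[[21, 23], [29, 31]]]
```

Each output value is the largest index inside its 2 x 2 x 2 window, which is the corner furthest from the origin.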

Output dimensions (for dilation = 1 and ceil_mode = False):

\begin{aligned} D_{out} &= \left\lfloor \frac{D_{in} + 2p_D - k_D}{s_D} + 1 \right\rfloor \\[4pt] H_{out} &= \left\lfloor \frac{H_{in} + 2p_H - k_H}{s_H} + 1 \right\rfloor \\[4pt] W_{out} &= \left\lfloor \frac{W_{in} + 2p_W - k_W}{s_W} + 1 \right\rfloor \end{aligned}
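As a quick check of the formulas above, the sketch below evaluates them along one axis with integer arithmetic. `pool_output_size` is a hypothetical helper, not a lucid function. The `ceil_mode` branch simply swaps floor for ceiling, which is an assumption about how the library rounds; dilation is taken to be 1.

```python
def pool_output_size(size_in, kernel, stride=None, padding=0, ceil_mode=False):
    """Output length along one axis, assuming dilation 1 (illustrative helper)."""
    if stride is None:
        stride = kernel  # documented default: stride falls back to kernel_size
    numerator = size_in + 2 * padding - kernel
    if ceil_mode:
        steps = -(-numerator // stride)  # integer ceiling division (assumed rounding)
    else:
        steps = numerator // stride      # integer floor division
    return steps + 1


print(pool_output_size(16, kernel=2, stride=2))                  # 8
print(pool_output_size(28, kernel=3, stride=2))                  # 13
print(pool_output_size(28, kernel=3, stride=2, ceil_mode=True))  # 14
```

The first two results match the shapes shown in the Examples section for a 16-wide axis with a 2-wide window and a 28-wide axis with a 3-wide window at stride 2.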

Parameters

kernel_size : int or tuple[int, int, int]

Size of the pooling window along (D, H, W).

stride : int or tuple[int, int, int] or None = None

Step between windows. Defaults to kernel_size.

padding : int or tuple[int, int, int] = 0

Zero-padding applied to all six faces of the input. Default: 0.

dilation : int or tuple[int, int, int] = 1

Spacing between kernel elements. Default: 1.

return_indices : bool = False

If True, also return the indices of the maximum values. Not yet supported. Default: False.

ceil_mode : bool = False

Use ceiling instead of floor when computing the output size. Default: False.
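Several parameters accept either a single int or a (D, H, W) tuple, and stride falls back to kernel_size when it is None. One way to picture that normalization is sketched below. `to_triple` and `resolve_pool_args` are hypothetical names chosen for this illustration; how lucid handles this internally is not shown in this reference.

```python
def to_triple(value):
    """Expand an int to (v, v, v); pass a length-3 sequence through as a tuple."""
    if isinstance(value, int):
        return (value, value, value)
    triple = tuple(value)
    if len(triple) != 3:
        raise ValueError(f"expected an int or 3 values, got {value!r}")
    return triple


def resolve_pool_args(kernel_size, stride=None, padding=0, dilation=1):
    """Return (kernel, stride, padding, dilation), each as a (D, H, W) triple."""
    kernel = to_triple(kernel_size)
    # Documented behaviour: stride defaults to kernel_size.
    resolved_stride = kernel if stride is None else to_triple(stride)
    return kernel, resolved_stride, to_triple(padding), to_triple(dilation)


print(resolve_pool_args(2))
# ((2, 2, 2), (2, 2, 2), (0, 0, 0), (1, 1, 1))
print(resolve_pool_args((1, 3, 3), stride=(1, 2, 2)))
# ((1, 3, 3), (1, 2, 2), (0, 0, 0), (1, 1, 1))
```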

Attributes

kernel_size : int or tuple[int, int, int]

stride : int or tuple[int, int, int] or None

padding : int or tuple[int, int, int]

dilation : int or tuple[int, int, int]

ceil_mode : bool

Notes

Input: (N, C, D_in, H_in, W_in)
Output: (N, C, D_out, H_out, W_out)

Typical use cases include video understanding (D = temporal frames) and 3-D medical imaging (CT / MRI volumes).
Computational cost per sample and channel is proportional to $D_{out} \times H_{out} \times W_{out} \times k_D k_H k_W$.
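To make the cost estimate concrete, the sketch below counts how many input elements are read across all windows for a given output shape and kernel, summed over the batch and channel axes. `pooling_window_reads` is a hypothetical helper used only for this arithmetic; it gives an operation count, not a timing.

```python
from math import prod


def pooling_window_reads(out_shape, kernel):
    """Element reads for out_shape = (N, C, D_out, H_out, W_out) and kernel = (k_D, k_H, k_W)."""
    # Every output element inspects one full k_D * k_H * k_W window.
    return prod(out_shape) * prod(kernel)


# The two configurations from the Examples section.
print(pooling_window_reads((1, 32, 8, 8, 8), (2, 2, 2)))    # 131072
print(pooling_window_reads((2, 64, 8, 13, 13), (1, 3, 3)))  # 1557504
```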

Examples

Halving all three spatial dimensions:
>>> import lucid
>>> import lucid.nn as nn
>>> pool = nn.MaxPool3d(kernel_size=2, stride=2)
>>> x = lucid.ones((1, 32, 16, 16, 16))
>>> y = pool(x)
>>> y.shape
(1, 32, 8, 8, 8)
Asymmetric window for video (different temporal/spatial strides):
>>> pool = nn.MaxPool3d(kernel_size=(1, 3, 3), stride=(1, 2, 2))
>>> x = lucid.ones((2, 64, 8, 28, 28))
>>> y = pool(x)
>>> y.shape
(2, 64, 8, 13, 13)

Used by 1

lucid.nn.modules

Constructors

__init__

→None

__init__(kernel_size: _Size3d, stride: _Size3d | None = None, padding: _Size3d = 0, dilation: _Size3d = 1, return_indices: bool = False, ceil_mode: bool = False)

Initialise the MaxPool3d module. See the class docstring for parameter semantics.

Instance methods

extra_repr

→str

extra_repr()

Return a string representation of the layer's configuration.

forward

→Tensor

forward(x: Tensor)

Apply the pooling operation to the input tensor.

Parameters

x : Tensor

Input tensor of shape (N, C, *), where * are the spatial dimensions appropriate for this pooling layer. For MaxPool3d these are (D_in, H_in, W_in).

Returns

Tensor

Pooled output tensor.
