avg_pool3d

avg_pool3d(x: Tensor, kernel_size: int | tuple[int, int, int], stride: int | tuple[int, int, int] | None = None, padding: int | tuple[int, int, int] = 0, ceil_mode: bool = False, count_include_pad: bool = True, divisor_override: int | None = None) → Tensor

3-D average pooling over a sliding window.

Volumetric counterpart of avg_pool2d: replaces each 3-D window with its arithmetic mean. Useful in 3-D CNNs as a smoother alternative to max_pool3d, and as the final "global" reduction before a classifier head when kernel_size spans the full (D, H, W) extent.

Parameters

x: Tensor

Input of shape (N, C, D, H, W): batch, channels, depth, height, width.

kernel_size: int or (int, int, int)

Size of the pooling window along (D, H, W). A single int applies to all three axes.

stride: int or (int, int, int) = None

Step of the window along each axis. Defaults to kernel_size when None.

padding: int or (int, int, int) = 0

Implicit zero-padding added to both sides of each spatial axis.

ceil_mode: bool = False

Use ceil instead of floor in the output-size formula.

count_include_pad: bool = True

Whether zero-padded cells count toward the denominator of the mean.

divisor_override: int = None

If given, used as the denominator in place of the window size |R|.
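The interaction between padding, count_include_pad and divisor_override is easiest to see along a single axis. The following is a minimal sketch, not the library's kernel: `window_divisors` is a hypothetical helper, it covers floor mode only, and it assumes the common convention that with count_include_pad=False only cells inside the unpadded input are counted. In 3-D the denominator would be the product of the per-axis counts.

```python
def window_divisors(size, kernel, stride, padding,
                    count_include_pad=True, divisor_override=None):
    """Denominator applied at each output position along one axis (floor mode).

    Hypothetical helper for illustration; assumes the usual pooling convention.
    """
    out = (size + 2 * padding - kernel) // stride + 1
    divisors = []
    for o in range(out):
        start = o * stride - padding   # window start in unpadded coordinates
        end = start + kernel           # exclusive end
        if divisor_override is not None:
            divisors.append(divisor_override)   # explicit denominator wins
        elif count_include_pad:
            divisors.append(kernel)             # padded cells are counted
        else:
            # count only the cells that overlap the real input [0, size)
            divisors.append(min(end, size) - max(start, 0))
    return divisors


include = window_divisors(4, kernel=3, stride=1, padding=1)
exclude = window_divisors(4, kernel=3, stride=1, padding=1,
                          count_include_pad=False)
```

With padding of 1, the two border windows each contain one padded cell, so excluding padding shrinks their denominator while interior windows are unaffected.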

Returns

Tensor

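Output of shape (N, C, D_out, H_out, W_out), where each spatial dimension follows the formula below (shown for depth; H_out and W_out are analogous, and ceil_mode=True replaces the floor with a ceiling):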

D_{\text{out}} = \left\lfloor \frac{D + 2 p_D - k_D}{s_D} + 1 \right\rfloor
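As a quick check of the output-size formula, the sketch below evaluates it for one axis in plain Python. `pooled_size` is a hypothetical helper, not part of lucid, and it ignores any extra clamping an implementation might apply to the last window when ceil_mode is set.

```python
def pooled_size(size, kernel, stride=None, padding=0, ceil_mode=False):
    """Output length along one axis, following the formula above.

    Hypothetical helper; stride falls back to kernel, as in the signature.
    """
    stride = kernel if stride is None else stride
    numerator = size + 2 * padding - kernel
    if ceil_mode:
        steps = -(-numerator // stride)   # integer ceiling division
    else:
        steps = numerator // stride       # integer floor division
    return steps + 1


# Shape of the documented example: (8, 16, 16) pooled with kernel_size=2
example_shape = tuple(pooled_size(n, 2) for n in (8, 16, 16))
```

For an odd length such as 7 with a kernel of 2, floor mode drops the trailing element (3 outputs) while ceil mode keeps a partial window (4 outputs).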

Notes

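Math, where R is the set of window offsets (l, m, n) with 0 ≤ l < k_D, 0 ≤ m < k_H, 0 ≤ n < k_W, so that |R| = k_D k_H k_W by default, and x is indexed after zero-padding: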

y_{i,c,d,h,w} = \frac{1}{|R|} \sum_{(l, m, n) \in R} x_{i,\,c,\,s_D d + l,\,s_H h + m,\,s_W w + n}
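The formula can be transcribed almost literally into loops. The sketch below is a naive pure-Python reference for a single (D, H, W) volume stored as nested lists, with no padding and the default denominator |R|; `avg_pool3d_naive` is a hypothetical name and is not how the library implements the operation.

```python
def avg_pool3d_naive(vol, kernel, stride=None):
    """Average-pool one (D, H, W) nested-list volume; no padding, floor mode."""
    kd, kh, kw = kernel
    sd, sh, sw = stride if stride is not None else kernel
    D, H, W = len(vol), len(vol[0]), len(vol[0][0])
    d_out = (D - kd) // sd + 1
    h_out = (H - kh) // sh + 1
    w_out = (W - kw) // sw + 1
    window_size = kd * kh * kw   # |R|

    out = [[[0.0] * w_out for _ in range(h_out)] for _ in range(d_out)]
    for d in range(d_out):
        for h in range(h_out):
            for w in range(w_out):
                total = 0.0
                # sum over the offsets (l, m, n) in R
                for l in range(kd):
                    for m in range(kh):
                        for n in range(kw):
                            total += vol[sd * d + l][sh * h + m][sw * w + n]
                out[d][h][w] = total / window_size
    return out


# A 2 x 2 x 4 volume whose entries are 8*d + 4*h + w
vol = [[[8 * d + 4 * h + w for w in range(4)] for h in range(2)]
       for d in range(2)]
pooled = avg_pool3d_naive(vol, kernel=(2, 2, 2))
```

Here the two non-overlapping 2 x 2 x 2 windows average to 6.5 and 8.5, giving an output of shape (1, 1, 2).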

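The gradient of average pooling is spread evenly over each window: every input cell in a window receives 1/|R| of that output's gradient, whereas max pooling routes the whole gradient to the single maximal cell. The backward signal is therefore dense and smoother than that of max_pool3d.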
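The even spreading of the gradient can be illustrated on one axis, which keeps the example short; the 3-D case applies the same rule with |R| = k_D k_H k_W. This is a sketch under those assumptions (no padding, default denominator), and `avg_pool1d_backward` is a hypothetical helper rather than a lucid function.

```python
def avg_pool1d_backward(grad_out, size, kernel, stride):
    """Gradient w.r.t. the input of a 1-D average pool (no padding)."""
    grad_in = [0.0] * size
    for o, g in enumerate(grad_out):
        for l in range(kernel):
            # each cell in the window gets an equal 1/kernel share
            grad_in[o * stride + l] += g / kernel
    return grad_in


disjoint = avg_pool1d_backward([1.0, 2.0], size=4, kernel=2, stride=2)
overlapping = avg_pool1d_backward([1.0, 1.0, 1.0], size=4, kernel=2, stride=1)
```

With disjoint windows each input receives exactly one share; with overlapping windows the shares from every window covering a cell are summed, so interior cells accumulate more gradient than border cells.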

Examples

>>> import lucid
>>> from lucid.nn.functional import avg_pool3d
>>> x = lucid.randn(1, 4, 8, 16, 16)
>>> y = avg_pool3d(x, kernel_size=2)
>>> y.shape
(1, 4, 4, 8, 8)
