avg_pool1d

avg_pool1d(x: Tensor, kernel_size: int | tuple[int, ...], stride: int | tuple[int, ...] | None = None, padding: int | tuple[int, ...] = 0, ceil_mode: bool = False, count_include_pad: bool = True) -> Tensor

1-D average pooling over a sliding window.

Replaces each window with its arithmetic mean. Smoother than max-pool, and useful wherever the overall response magnitude must be preserved (e.g. before a fully connected classifier head).

Parameters

x : Tensor
Input of shape (N, C, L).
kernel_size : int or tuple of int
Size of the pooling window.
stride : int or tuple of int, default=None
Window step. Defaults to kernel_size (non-overlapping).
padding : int or tuple of int, default=0
Implicit zero-padding on both sides.
ceil_mode : bool, default=False
Use ceil instead of floor in the output-size formula.
count_include_pad : bool, default=True
When True, padding cells contribute to the denominator.
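
To make the effect of count_include_pad concrete, here is a small pure-Python sketch for a single window that overlaps the padded border. The helper name edge_window_mean is hypothetical and not part of lucid; it only illustrates the choice of denominator described above.

```python
def edge_window_mean(values, pad_cells, count_include_pad=True):
    """Mean of one window made of `pad_cells` zero-padding cells plus real `values`.

    Hypothetical helper for illustration only; not a lucid function.
    """
    total = sum(values)  # padding cells are zeros, so they add nothing to the sum
    if count_include_pad:
        denom = len(values) + pad_cells  # padding cells count toward the divisor
    else:
        denom = len(values)              # only real input cells count
    return total / denom


# Window of size 3 covering one padding cell and the inputs 1.0 and 2.0.
print(edge_window_mean([1.0, 2.0], pad_cells=1, count_include_pad=True))   # 1.0
print(edge_window_mean([1.0, 2.0], pad_cells=1, count_include_pad=False))  # 1.5
```

Windows that lie entirely inside the input are unaffected by the flag, since they contain no padding cells.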

Returns

Tensor

Output of shape (N, C, L_out) where

$$L_{\text{out}} = \left\lfloor \frac{L + 2 p - k}{s} + 1 \right\rfloor$$
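
The output length can be checked by hand with a few lines of Python. This is a sketch of the formula only, reading k as kernel_size, s as stride and p as padding; the function name pooled_length is hypothetical. It assumes ceil_mode simply swaps floor for ceil, as the parameter description states; any extra boundary rule the library may apply in ceil mode is not modelled here.

```python
def pooled_length(length, kernel_size, stride=None, padding=0, ceil_mode=False):
    """Output length L_out for a 1-D pooling window (hypothetical helper)."""
    if stride is None:
        stride = kernel_size  # documented default: non-overlapping windows
    span = length + 2 * padding - kernel_size
    if ceil_mode:
        steps = -(-span // stride)  # integer ceil division
    else:
        steps = span // stride      # integer floor division
    return steps + 1


print(pooled_length(20, kernel_size=4, stride=2))                  # 9
print(pooled_length(10, kernel_size=3, stride=2))                  # 4
print(pooled_length(10, kernel_size=3, stride=2, ceil_mode=True))  # 5
```

The first call matches the shape in the Examples section: L = 20, k = 4, s = 2 gives 9.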

Notes

Math (averaging window $R$ of cardinality $|R|$):

$$y_{i,c,l} = \frac{1}{|R|} \sum_{m \in R} x_{i,\,c,\,s l + m - p}$$
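
As a reference for the formula, here is a minimal pure-Python sketch that pools one row (a single fixed pair of batch index i and channel c). It is not lucid's implementation; the name avg_pool1d_row is hypothetical. It assumes the default behaviour (floor mode, count_include_pad=True), so every window is divided by kernel_size.

```python
def avg_pool1d_row(row, kernel_size, stride=None, padding=0):
    """Average-pool one 1-D sequence (hypothetical reference helper)."""
    if stride is None:
        stride = kernel_size
    # Explicit zero-padding: index (s*l + m - p) in `row` becomes (s*l + m) here.
    padded = [0.0] * padding + [float(v) for v in row] + [0.0] * padding
    out_len = (len(padded) - kernel_size) // stride + 1
    out = []
    for l in range(out_len):
        start = stride * l
        window = padded[start:start + kernel_size]
        out.append(sum(window) / kernel_size)  # |R| = kernel_size, padding counted
    return out


print(avg_pool1d_row([1, 2, 3, 4, 5, 6], kernel_size=2))  # [1.5, 3.5, 5.5]
```

Applying the same loop independently to every (i, c) row of an (N, C, L) input gives the full (N, C, L_out) output.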

Unlike max-pool, which routes each output's gradient only to the maximal element of its window, average-pool is linear in its input, so each output's gradient is distributed uniformly (scaled by $1/|R|$) across all positions of its window.
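
The gradient flow can be sketched in pure Python for one row without padding. The function name avg_pool1d_row_backward is hypothetical and this is not how lucid computes gradients internally; it only shows that every output gradient is split into kernel_size equal shares and that shares accumulate where windows overlap.

```python
def avg_pool1d_row_backward(grad_out, input_len, kernel_size, stride=None):
    """Gradient w.r.t. the input of an unpadded 1-D average pool (hypothetical helper)."""
    if stride is None:
        stride = kernel_size
    grad_in = [0.0] * input_len
    for l, g in enumerate(grad_out):
        share = g / kernel_size  # each window position receives an equal share
        for m in range(kernel_size):
            grad_in[stride * l + m] += share  # overlapping windows accumulate
    return grad_in


# Non-overlapping windows: every input receives exactly one share.
print(avg_pool1d_row_backward([1.0, 1.0], input_len=4, kernel_size=2))            # [0.5, 0.5, 0.5, 0.5]
# Overlapping windows (stride 1): the middle input belongs to both windows.
print(avg_pool1d_row_backward([1.0, 1.0], input_len=3, kernel_size=2, stride=1))  # [0.5, 1.0, 0.5]
```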

Examples

>>> import lucid
>>> from lucid.nn.functional import avg_pool1d
>>> x = lucid.randn(2, 3, 20)
>>> y = avg_pool1d(x, kernel_size=4, stride=2)
>>> y.shape
(2, 3, 9)