nn.functional.adaptive_avg_pool3d

lucid.nn.functional.adaptive_avg_pool3d(input_: Tensor, output_size: int | tuple[int, int, int]) -> Tensor

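The adaptive_avg_pool3d function performs adaptive average pooling over the three spatial dimensions (depth, height, and width) of a 5D input tensor. It derives the kernel size, stride, and padding from the input size and the requested output size, so the caller specifies only the output size.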

Function Signature

def adaptive_avg_pool3d(input_: Tensor, output_size: tuple[int, int, int] | int) -> Tensor

Parameters

  • input_ (Tensor): The input tensor of shape \((N, C, D, H, W)\), where \(N\) is the batch size, \(C\) is the number of channels, and \(D\), \(H\), and \(W\) are the depth, height, and width of the input.

  • output_size (tuple[int, int, int] | int): The desired output size \((D_{out}, H_{out}, W_{out})\). If an integer is provided, the same size is used for all three dimensions.

Returns

  • Tensor: The result of adaptive average pooling, with shape \((N, C, D_{out}, H_{out}, W_{out})\).
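The shape contract above can be sketched in plain Python. The helper name expected_output_shape is hypothetical and not part of Lucid; it only illustrates how an integer output_size expands to all three spatial dimensions while N and C pass through unchanged.

```python
def expected_output_shape(input_shape, output_size):
    """Return the (N, C, D_out, H_out, W_out) shape described above.

    Hypothetical helper for illustration; not a Lucid function.
    """
    if len(input_shape) != 5:
        raise ValueError("expected a 5D shape (N, C, D, H, W)")
    # An int is expanded to the same size for depth, height, and width.
    if isinstance(output_size, int):
        output_size = (output_size, output_size, output_size)
    if len(output_size) != 3:
        raise ValueError("output_size must be an int or a 3-tuple")
    n, c = input_shape[:2]
    return (n, c, *output_size)


print(expected_output_shape((1, 3, 16, 16, 16), 4))        # (1, 3, 4, 4, 4)
print(expected_output_shape((2, 3, 8, 8, 8), (2, 3, 4)))   # (2, 3, 2, 3, 4)
```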

Behavior

The adaptive_avg_pool3d function computes kernel size, stride, and padding dynamically based on the input size \(D, H, W\) and the target output size \(D_{out}, H_{out}, W_{out}\). The operation averages over the computed kernel regions to produce the output tensor.
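As a rough illustration of how such parameters might be derived, the sketch below handles a single axis in plain Python. The convention used here (zero padding, stride = in_size // out_size, and a kernel sized so the last window ends at the input boundary) is an assumption chosen for illustration, not a confirmed description of Lucid's internals; the names adaptive_params and adaptive_avg_pool1d are hypothetical. The 3D operation would apply the same idea independently along depth, height, and width.

```python
def adaptive_params(in_size, out_size):
    """Derive (kernel_size, stride, padding) for one axis.

    Assumed convention (not confirmed for Lucid): zero padding,
    stride = in_size // out_size, and a kernel sized so that the
    last window ends exactly at the input boundary.
    """
    if not 1 <= out_size <= in_size:
        raise ValueError("need 1 <= out_size <= in_size")
    stride = in_size // out_size
    kernel_size = in_size - (out_size - 1) * stride
    return kernel_size, stride, 0


def adaptive_avg_pool1d(values, out_size):
    """Average each window of `values` along a single axis."""
    kernel_size, stride, _ = adaptive_params(len(values), out_size)
    return [
        sum(values[i * stride : i * stride + kernel_size]) / kernel_size
        for i in range(out_size)
    ]


print(adaptive_params(16, 4))   # (4, 4, 0)
print(adaptive_params(10, 4))   # (4, 2, 0)
print(adaptive_avg_pool1d([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0], 2))  # [2.5, 6.5]
```

When the input size is not a multiple of the output size (10 to 4 above), this convention makes neighbouring windows overlap rather than padding the input.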

Forward Calculation

Along each spatial axis, the output dimensions \(D_{out}, H_{out}, W_{out}\) are related to the kernel size, stride, and padding chosen for that axis by:

\[ \begin{align}\begin{aligned}D_{out} = \frac{D + 2 \cdot \text{padding} - \text{kernel\_size}}{\text{stride}} + 1\\H_{out} = \frac{H + 2 \cdot \text{padding} - \text{kernel\_size}}{\text{stride}} + 1\\W_{out} = \frac{W + 2 \cdot \text{padding} - \text{kernel\_size}}{\text{stride}} + 1\end{aligned}\end{align} \]

where \(\text{padding}\) is computed symmetrically to ensure coverage of the input tensor.
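The relation above can be checked numerically for one axis. This is a minimal sketch that assumes the division is floored, as in standard pooling layers; the helper name pooled_size and the parameter values are illustrative and are not taken from Lucid.

```python
def pooled_size(size, kernel_size, stride, padding=0):
    """Output length along one axis, assuming floor division."""
    return (size + 2 * padding - kernel_size) // stride + 1


# pooled_size(D, kernel_size, stride, padding) -> D_out
print(pooled_size(16, 4, 4))     # 4
print(pooled_size(8, 4, 4))      # 2
print(pooled_size(10, 4, 2))     # 4
print(pooled_size(7, 3, 2, 1))   # 4
```

The first two calls correspond to the 16 to 4 and 8 to 2 reductions used in the examples on this page, under the assumption of non-overlapping windows and zero padding.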

Examples

Basic Example

import lucid.nn.functional as F
from lucid import Tensor

# Input tensor with shape (1, 3, 16, 16, 16), filled with ones
input_ = Tensor([[[[[1.0] * 16] * 16] * 16] * 3])

# Adaptive average pooling to output size (4, 4, 4)
output = F.adaptive_avg_pool3d(input_, output_size=(4, 4, 4))

print(output.shape)  # (1, 3, 4, 4, 4)

Output Explanation

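The input tensor is pooled to a depth, height, and width of 4. Because each spatial dimension shrinks from 16 to 4, every output element is the mean of one 4×4×4 region of the input. The input here is constant, so every output value equals 1.0.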

Advanced Example with Variable Batch Size

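# Input tensor with batch size 2 and shape (2, 3, 8, 8, 8).
# All samples in a batch must share the same channel count and spatial size.
input_ = Tensor([
    [[[[1.0] * 8] * 8] * 8] * 3,
    [[[[2.0] * 8] * 8] * 8] * 3,
])

output = F.adaptive_avg_pool3d(input_, output_size=(2, 2, 2))

print(output.shape)  # (2, 3, 2, 2, 2)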

Explanation

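The output's spatial size is set by output_size alone, not by the input's depth, height, and width. Every sample in the batch is pooled to (2, 2, 2), and an input with a different spatial size would be pooled to the same output size, with the pooling regions adjusted accordingly.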