nn.AdaptiveAvgPool3d¶
- class lucid.nn.AdaptiveAvgPool3d(output_size: int | tuple[int, int, int])¶
The AdaptiveAvgPool3d performs adaptive average pooling on a 3D input tensor, dynamically determining kernel size, stride, and padding to produce a specified output size.
Class Signature¶
class AdaptiveAvgPool3d(nn.Module):
def __init__(self, output_size: tuple[int, int, int] | int)
Parameters¶
output_size (tuple[int, int, int] | int): The desired output size \((D_{out}, H_{out}, W_{out})\). If an integer is provided, the same size is used for all three dimensions.
Attributes¶
output_size (tuple[int, int, int] | int): Stores the target output size for the pooling operation.
Forward Calculation¶
The AdaptiveAvgPool3d module applies adaptive average pooling on a 3D input tensor of shape \((N, C, D, H, W)\), producing an output tensor of shape \((N, C, D_{out}, H_{out}, W_{out})\). The kernel size, stride, and padding are dynamically determined based on the input size \(D, H, W\) and the desired output size \(D_{out}, H_{out}, W_{out}\).
Behavior¶
The forward calculation for the module follows the same logic as the functional API (adaptive_avg_pool3d). It dynamically adjusts pooling parameters to ensure the target output size is achieved.
Examples¶
Basic Example
import lucid.nn as nn
# Input tensor with shape (1, 3, 16, 16, 16)
input_ = Tensor([[[[[1.0] * 16] * 16] * 16] * 3])
# AdaptiveAvgPool3d module with output size (4, 4, 4)
pool = nn.AdaptiveAvgPool3d(output_size=(4, 4, 4))
output = pool(input_)
print(output) # Shape: (1, 3, 4, 4, 4)
Output Explanation
The input tensor is adaptively pooled to produce a tensor with depth, height, and width of 4, averaging over evenly spaced regions.
Advanced Example with Variable Batch Size
# Input tensor with batch size 2
input_ = Tensor([
[[[[1.0] * 8] * 8] * 8],
[[[[2.0] * 12] * 12] * 12]
])
pool = nn.AdaptiveAvgPool3d(output_size=(2, 2, 2))
output = pool(input_)
print(output) # Shape: (2, 3, 2, 2, 2)
Explanation
The pooling dynamically adjusts for each input’s depth, height, and width, producing a consistent output size of (2, 2, 2) across all samples in the batch.