SinusoidalEmbedding2D
Module
SinusoidalEmbedding2D(height: int, width: int, embedding_dim: int, base: float = 10000.0)
Fixed 2-D sinusoidal positional encoding (DETR §A.4 / Carion et al., 2020).
Encodes spatial (row, column) coordinates: the first half of the
embedding dimension encodes the column index and the second half the
row index. Used in DETR and reusable for any image transformer that
needs absolute spatial position without learnable parameters.
Args:
height: Feature-map height H.
width: Feature-map width W.
embedding_dim: Per-position embedding size; must be divisible by 4.
base: Frequency base (DETR uses 10000).
Forward:
Takes no arguments. Returns the (H * W, embedding_dim) table in
row-major order, so position (row, col) sits at index row * W + col.
Notes
The table is built once in __init__ and registered as a
non-persistent buffer, so it travels with .to(device=...) but is
omitted from state_dict. Use the module form when (H, W) is
fixed across forward passes (typical for DETR-style decoders fed by a
fixed-size CNN feature map); fall back to the functional
lucid.nn.functional.sinusoidal_embedding_2d when the spatial
grid varies per batch. See that function for the row / column
encoding split.
Examples
>>> import lucid.nn as nn
>>> pe = nn.SinusoidalEmbedding2D(height=16, width=24, embedding_dim=128)
>>> pe().shape
(384, 128)
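
The layout described above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: the helper name sinusoidal_table_2d is hypothetical, and both the interleaving of sin and cos within each half and the exact frequency schedule are guesses that may differ from what lucid does. The column-first split, the row-major ordering, and the divisible-by-4 requirement are taken from the text above.

```python
import math


def sinusoidal_table_2d(height, width, embedding_dim, base=10000.0):
    """Build an (height * width, embedding_dim) table as nested lists.

    Assumed layout (illustrative, not taken from the library source):
    channels [0, D/2) encode the column index, channels [D/2, D) the
    row index, and each half stores interleaved sin/cos pairs.
    """
    if embedding_dim % 4 != 0:
        raise ValueError("embedding_dim must be divisible by 4")

    half = embedding_dim // 2   # channels available to each axis
    num_freqs = half // 2       # one sin/cos pair per frequency
    # Geometric frequency ladder: base ** (-2k / half), k = 0 .. num_freqs - 1
    inv_freq = [base ** (-2.0 * k / half) for k in range(num_freqs)]

    def encode_1d(pos):
        # Encode a single integer coordinate into `half` channels.
        out = []
        for f in inv_freq:
            out.append(math.sin(pos * f))
            out.append(math.cos(pos * f))
        return out

    table = []
    for row in range(height):   # row-major: the row index is the slow axis
        row_code = encode_1d(row)
        for col in range(width):
            # Column code first, row code second, per the split described above.
            table.append(encode_1d(col) + row_code)
    return table


table = sinusoidal_table_2d(height=16, width=24, embedding_dim=128)
print(len(table), len(table[0]))  # 384 128
```

In this sketch, each axis receives embedding_dim / 2 channels and each frequency consumes two of them (one sin, one cos), which is one way to see why embedding_dim would need to be a multiple of 4.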