class SinusoidalEmbedding2D extends Module
SinusoidalEmbedding2D(height: int, width: int, embedding_dim: int, base: float = 10000.0)

Fixed 2-D sinusoidal positional encoding (DETR §A.4 / Carion et al. 2020).

Encodes spatial (row, column) coordinates: the first half of the embedding dimension encodes the column index and the second half encodes the row index. Used in DETR, and reusable in any image transformer that needs absolute spatial position without learnable parameters.

Args:
    height: Feature-map height H.
    width: Feature-map width W.
    embedding_dim: Per-position embedding size; must be divisible by 4.
    base: Frequency base (DETR uses 10000).

Forward: Returns the (H * W, embedding_dim) table in row-major order.
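To make the layout concrete, here is a minimal pure-Python sketch of a table with the shape and column/row split described above. It is not the library's implementation: the helper name sinusoidal_table_2d, the frequency ladder base ** (-i / (embedding_dim / 4)), and the sine-then-cosine channel order inside each half are assumptions chosen for illustration. It also shows one plausible reason for the divisible-by-4 rule: each axis gets embedding_dim / 2 channels, and those are consumed in sin/cos pairs.

```python
import math


def sinusoidal_table_2d(height, width, embedding_dim, base=10000.0):
    """Build an (H * W, embedding_dim) table in row-major order.

    Hypothetical helper. The channel order inside each half
    (all sines, then all cosines) is an assumption, not taken
    from the library.
    """
    if embedding_dim % 4 != 0:
        raise ValueError("embedding_dim must be divisible by 4")

    quarter = embedding_dim // 4  # number of sin/cos pairs per axis
    # Assumed geometric frequency ladder: base ** (-i / quarter)
    freqs = [base ** (-i / quarter) for i in range(quarter)]

    def encode(pos):
        # One axis coordinate -> embedding_dim // 2 channels
        sines = [math.sin(pos * f) for f in freqs]
        cosines = [math.cos(pos * f) for f in freqs]
        return sines + cosines

    table = []
    for row in range(height):      # row-major: rows vary slowest
        for col in range(width):
            # First half encodes the column, second half the row
            table.append(encode(col) + encode(row))
    return table


table = sinusoidal_table_2d(height=16, width=24, embedding_dim=128)
print(len(table), len(table[0]))  # 384 128
```

Under this row-major layout, the entry for grid position (row, col) sits at flat index row * width + col.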

Notes

The table is built once in __init__ and registered as a non-persistent buffer, so it travels with .to(device=...) but is omitted from state_dict. Use the module form when (H, W) is fixed across forward passes (typical for DETR-style decoders fed by a fixed-size CNN feature map); fall back to the functional lucid.nn.functional.sinusoidal_embedding_2d when the spatial grid varies per batch. See that function for the row / column encoding split.

Examples

>>> import lucid.nn as nn
>>> pe = nn.SinusoidalEmbedding2D(height=16, width=24, embedding_dim=128)
>>> pe().shape
(384, 128)

Methods (2)

__init__(height: int, width: int, embedding_dim: int, base: float = 10000.0) -> None
forward() -> Tensor

Return the precomputed (H * W, embedding_dim) table.