class SinusoidalEmbedding2D extends Module

SinusoidalEmbedding2D(height: int, width: int, embedding_dim: int, base: float = 10000.0)

Fixed 2-D sinusoidal positional encoding (DETR §A.4 / Carion 2020).

Encodes spatial (row, column) coordinates: first half of the embedding dim encodes the column index, second half the row index. Used in DETR and reusable for any image transformer that wants to inject absolute spatial position without learnable parameters. forward() returns the (H * W, embedding_dim) table in row-major order.

Parameters

height: int

Feature-map height H.

width: int

Feature-map width W.

embedding_dim: int

Per-position embedding size; must be divisible by 4 (half encodes column, half encodes row, each split into sin/cos).

base: float = 10000.0

Frequency base. DETR uses 10000, which is also the default.

Notes

The table is built once in __init__ and registered as a non-persistent buffer, so it travels with .to(device=...) but is omitted from state_dict. Use the module form when (H, W) is fixed across forward passes (typical for DETR-style decoders fed by a fixed-size CNN feature map); fall back to the functional lucid.nn.functional.sinusoidal_embedding_2d when the spatial grid varies per batch. See that function for the row / column encoding split.

Examples

>>> import lucid.nn as nn
>>> pe = nn.SinusoidalEmbedding2D(height=16, width=24, embedding_dim=128)
>>> pe().shape
(384, 128)
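The column / row split described above can be sketched in plain Python. This is an illustration only, not the library's implementation: the helper name `sinusoidal_table_2d`, the frequency ladder `base ** (-2i / half)`, and the "all sines, then all cosines" ordering within each half are assumptions; the actual layout is documented in lucid.nn.functional.sinusoidal_embedding_2d.

```python
import math


def sinusoidal_table_2d(height, width, embedding_dim, base=10000.0):
    """Build a row-major (height * width) x embedding_dim table as nested lists.

    Hypothetical helper: the frequency formula and the sin/cos ordering
    are assumptions made for illustration.
    """
    if embedding_dim % 4 != 0:
        raise ValueError("embedding_dim must be divisible by 4")

    half = embedding_dim // 2   # channels devoted to each axis
    n_freq = half // 2          # number of sin/cos pairs per axis

    # Assumed geometric frequency ladder, one frequency per sin/cos pair.
    inv_freq = [base ** (-2.0 * i / half) for i in range(n_freq)]

    def encode(pos):
        # Assumed ordering: all sines first, then all cosines.
        sines = [math.sin(pos * f) for f in inv_freq]
        cosines = [math.cos(pos * f) for f in inv_freq]
        return sines + cosines

    table = []
    for row in range(height):        # row-major: rows vary slowest
        for col in range(width):
            # First half encodes the column index, second half the row index.
            table.append(encode(col) + encode(row))
    return table


table = sinusoidal_table_2d(height=16, width=24, embedding_dim=128)
print(len(table), len(table[0]))
```

With `height=16`, `width=24`, and `embedding_dim=128`, the sketch produces 384 rows of 128 values each, matching the shape returned by the module.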

Used by 1

lucid.nn.modules

Constructors

__init__(height: int, width: int, embedding_dim: int, base: float = 10000.0) → None

Instance methods

forward() → Tensor

Return the precomputed (H * W, embedding_dim) table.
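Because the table is flattened in row-major order, a grid position maps to a table row by simple arithmetic. A minimal sketch, with `flat_index` and `grid_position` as hypothetical helper names that are not part of the library:

```python
def flat_index(row, col, width):
    """Row of the (H * W, embedding_dim) table holding grid cell (row, col)."""
    return row * width + col


def grid_position(index, width):
    """Inverse mapping: table row index back to (row, col)."""
    return divmod(index, width)


# For a 16 x 24 grid, cell (row=2, col=5) sits at table row 2 * 24 + 5.
idx = flat_index(2, 5, width=24)
print(idx, grid_position(idx, width=24))
```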
