class TimestepEmbedding (extends Module)

TimestepEmbedding(in_dim: int, out_dim: int, base: float = 10000.0)

Sinusoidal embedding of integer timesteps followed by a 2-layer MLP.

Diffusion U-Nets condition every residual block on the current timestep t. The canonical recipe (Ho et al., 2020, Appendix B):

emb(t) = MLP(sinusoidal_embedding(t, dim))

where the sinusoidal part uses the same half-sin / half-cos formula as SinusoidalEmbedding but is evaluated directly on each scalar t rather than looked up by position index. Diffusion codebases tend to reimplement this layer; Lucid centralises it so VAE / DDPM / NCSN share one canonical layer.
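The half-sin / half-cos formula can be sketched in plain NumPy. This is an illustration, not Lucid's implementation: the function name `sinusoidal_embedding`, the sin-before-cos ordering, and the exact frequency exponent are assumptions chosen to match the common Transformer-style convention.

```python
import numpy as np

def sinusoidal_embedding(t, dim, base=10000.0):
    """Map scalar timesteps to a (..., dim) array: first half sin, second half cos."""
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    half = dim // 2
    # Geometric ladder of frequencies starting at 1 and decaying towards 1/base
    # (assumed convention; implementations differ in the exact exponent).
    freqs = base ** (-np.arange(half) / half)
    # Broadcast each timestep against every frequency: shape (..., half).
    angles = np.asarray(t, dtype=np.float64)[..., None] * freqs
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)

emb = sinusoidal_embedding(np.array([0, 250, 500, 999]), dim=128)
print(emb.shape)  # (4, 128)
```

At t = 0 the sin half is all zeros and the cos half is all ones, which is a quick sanity check for any implementation of this formula.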

Args:
    in_dim: Dimension of the raw sinusoidal embedding. Must be even.
    out_dim: Dimension of the projected output (the conditioning vector consumed by U-Net residual blocks). Often 4 * in_dim.
    base: Frequency base for the sinusoidal table. Defaults to 10_000 per the original Transformer convention.

Forward: forward(timesteps) takes an integer tensor of arbitrary shape (typically (B,)) and returns the projected embedding of shape (*timesteps.shape, out_dim).
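To make the shape contract concrete, here is a NumPy stand-in for the forward pass. The helper `timestep_embedding_forward`, the SiLU activation, the hidden width equal to out_dim, and the random weights are all assumptions for illustration; the page does not specify how Lucid's MLP is laid out.

```python
import numpy as np

def timestep_embedding_forward(t, w1, b1, w2, b2, base=10000.0):
    """Sinusoidal features followed by a two-layer MLP (hypothetical stand-in)."""
    in_dim = w1.shape[0]
    half = in_dim // 2
    freqs = base ** (-np.arange(half) / half)
    angles = np.asarray(t, dtype=np.float64)[..., None] * freqs
    # Deterministic features of shape (*t.shape, in_dim).
    feats = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    hidden = feats @ w1 + b1
    hidden = hidden / (1.0 + np.exp(-hidden))  # SiLU: x * sigmoid(x), assumed activation
    return hidden @ w2 + b2

rng = np.random.default_rng(0)
in_dim, out_dim = 128, 512
w1 = rng.normal(size=(in_dim, out_dim)) * 0.02
b1 = np.zeros(out_dim)
w2 = rng.normal(size=(out_dim, out_dim)) * 0.02
b2 = np.zeros(out_dim)

t = np.array([[0, 250], [500, 999], [10, 20]])  # timesteps of shape (3, 2)
cond = timestep_embedding_forward(t, w1, b1, w2, b2)
print(cond.shape)  # (3, 2, 512)
```

Because the trailing axis is appended by broadcasting and the matrix products act only on that last axis, any leading shape passes through unchanged.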

Notes

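The output is not a learnable position table: only the two linear layers of the MLP are trainable. Different timestep values produce different conditioning vectors via the deterministic sinusoidal mapping followed by the learned projection, so DDPM-style training, which samples a random t at each step, needs no per-timestep parameters. The parameter count is fixed by the MLP alone and depends on its hidden width; with the common layout Linear(in_dim, out_dim) followed by Linear(out_dim, out_dim), it is in_dim * out_dim + out_dim * out_dim weights plus 2 * out_dim biases.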
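The exact total depends on the hidden width and on whether the linear layers carry biases, neither of which this page states. A small counting helper, with the hypothetical name `mlp_param_count` and an assumed Linear(in_dim, hidden_dim) then Linear(hidden_dim, out_dim) layout, makes the arithmetic explicit:

```python
def mlp_param_count(in_dim, hidden_dim, out_dim, bias=True):
    """Trainable parameters in Linear(in_dim, hidden_dim) + Linear(hidden_dim, out_dim)."""
    first = in_dim * hidden_dim + (hidden_dim if bias else 0)
    second = hidden_dim * out_dim + (out_dim if bias else 0)
    return first + second

# Assumed layout for the example above: hidden width equal to out_dim, with biases.
total = mlp_param_count(in_dim=128, hidden_dim=512, out_dim=512)
print(total)  # 328704
```

The sinusoidal stage contributes nothing to this number, since its frequencies are fixed by base and in_dim.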

Examples

>>> import lucid
>>> from lucid.nn import TimestepEmbedding
>>> emb = TimestepEmbedding(in_dim=128, out_dim=512)
>>> t = lucid.tensor([0, 250, 500, 999])         # batch of 4 timesteps
>>> cond = emb(t)
>>> cond.shape
(4, 512)

Methods (2)

__init__(in_dim: int, out_dim: int, base: float = 10000.0) -> None

forward(timesteps: Tensor) -> Tensor

Project timesteps into a conditioning tensor of shape (*timesteps.shape, out_dim).