TimestepEmbedding
Module
TimestepEmbedding(in_dim: int, out_dim: int, base: float = 10000.0)
Sinusoidal-frequency embedding of integer timesteps, followed by a 2-layer MLP.
Diffusion U-Nets condition every residual block on the current timestep
t. The canonical recipe (Ho et al., 2020 §3.2):
emb(t) = MLP(sinusoidal_embedding(t, in_dim))
where the sinusoidal part uses the same half-sin / half-cos formula as
SinusoidalEmbedding but is evaluated directly on each scalar t rather than
looked up by position index. Every diffusion model reimplements this;
Lucid centralises it so VAE / DDPM / NCSN share one canonical layer.
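As an illustration of the sinusoidal stage only, here is a minimal pure-Python sketch. It is not Lucid's implementation: the helper name sinusoidal_embedding, the frequency schedule base ** (-i / half), and the sines-then-cosines ordering are assumptions based on the common Transformer-style convention, and the library may differ in any of them.

```python
import math


def sinusoidal_embedding(t, dim, base=10000.0):
    """Map one scalar timestep to `dim` floats (illustrative, not Lucid's code)."""
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    half = dim // 2
    # Assumed frequency ladder: base ** (-i / half) for i = 0 .. half - 1.
    freqs = [base ** (-i / half) for i in range(half)]
    angles = [t * f for f in freqs]
    # Assumed layout: first half sines, second half cosines.
    return [math.sin(a) for a in angles] + [math.cos(a) for a in angles]


features = sinusoidal_embedding(250, 128)
print(len(features))  # number of features equals dim
```

The function holds no state: the features are recomputed from t on every call, so any integer timestep can be embedded without a table sized to the number of diffusion steps.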
Args:
in_dim: Dimension of the raw sinusoidal embedding. Must be even.
out_dim: Dimension of the projected output (the conditioning vector
consumed by U-Net residual blocks). Often 4 * in_dim.
base: Frequency base for the sinusoidal table. Defaults to
10_000 per the original Transformer convention.
Forward:
forward(timesteps) — timesteps is an integer tensor of
arbitrary shape (typically (B,)); returns the projected
embedding of shape (*timesteps.shape, out_dim).
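The shape rule can be illustrated with nested Python lists standing in for tensors. This sketch covers only the sinusoidal stage, so the trailing axis has size in_dim; the MLP then maps that last axis to out_dim. The helpers embed_nested and shape_of are hypothetical names, and the frequency formula is the same assumed convention as above rather than Lucid's confirmed one.

```python
import math


def embed_nested(timesteps, dim, base=10000.0):
    """Apply the sinusoidal map to every integer in a nested list (illustrative)."""
    if isinstance(timesteps, (list, tuple)):
        return [embed_nested(t, dim, base) for t in timesteps]
    half = dim // 2
    # Assumed frequency schedule: base ** (-i / half).
    angles = [timesteps * base ** (-i / half) for i in range(half)]
    return [math.sin(a) for a in angles] + [math.cos(a) for a in angles]


def shape_of(nested):
    """Return the shape of a regular nested list as a tuple."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return tuple(shape)


batch = [[0, 1, 2], [3, 4, 5]]             # "tensor" of shape (2, 3)
print(shape_of(embed_nested(batch, 8)))    # input shape plus one trailing axis
```

Each scalar is embedded independently, which is why the input shape is preserved and a single feature axis is appended.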
Notes
The layer is not a learnable position table: only the two
linear layers of the MLP are trainable. Different timestep
values produce different conditioning vectors via the deterministic
sinusoidal features followed by the learned projection. Because nothing
is stored per timestep, DDPM-style training that samples a random t at
each step needs no per-step parameters; the layer adds
2 * in_dim * out_dim parameters in total.
Examples
>>> import lucid
>>> from lucid.nn import TimestepEmbedding
>>> emb = TimestepEmbedding(in_dim=128, out_dim=512)
>>> t = lucid.tensor([0, 250, 500, 999]) # batch of 4 timesteps
>>> cond = emb(t)
>>> cond.shape
(4, 512)