embedding

embedding(x: Tensor, weight: Tensor, padding_idx: int | None = None, max_norm: float | None = None, norm_type: float = 2.0, scale_grad_by_freq: bool = False, sparse: bool = False) -> Tensor

Look up rows of an embedding table by integer indices.

A learned embedding table maps integer tokens / categorical features into dense vectors:

$$\mathrm{out}[i_1, \dots, i_k] = W[\, x[i_1, \dots, i_k] \,]$$

where $W$, of shape (num_embeddings, embedding_dim), is the lookup table and $x$ holds integer indices in [0, num_embeddings). This is equivalent to the one-hot matrix product $\mathrm{onehot}(x)\, W$, but it is computed with an indexed gather.
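
The equivalence between the gather and the one-hot product can be checked numerically. The following is a minimal sketch that uses NumPy as a stand-in for Lucid tensors; the variable names are illustrative and nothing here is taken from Lucid's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
num_embeddings, embedding_dim = 10, 4
W = rng.standard_normal((num_embeddings, embedding_dim))  # lookup table
x = np.array([[1, 2, 4], [4, 3, 2]])                      # integer indices

# Indexed gather: pick row W[x[i, j]] for every position (i, j).
out_gather = W[x]                     # shape (2, 3, 4)

# One-hot matmul: build a (2, 3, 10) one-hot tensor and multiply by W.
onehot = np.eye(num_embeddings)[x]    # shape (2, 3, 10)
out_matmul = onehot @ W               # shape (2, 3, 4)

print(out_gather.shape)                      # (2, 3, 4)
print(np.allclose(out_gather, out_matmul))   # True
```

The gather touches only the selected rows, whereas the one-hot product materialises a tensor with num_embeddings entries per index, which is why the lookup form is preferred.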

Parameters

x: Tensor
Integer index tensor of arbitrary shape (*).
weight: Tensor
Embedding table of shape (num_embeddings, embedding_dim).
padding_idx: int = None
If given, the embedding vector at weight[padding_idx] is treated as a padding slot: its gradient is forced to zero so the padding embedding stays at its initialised value (typically a zero vector) throughout training.
max_norm: float = None
If given, every row of weight whose $L_p$ norm exceeds max_norm is renormalised in-place to have norm max_norm prior to the lookup (with $p$ = norm_type).
norm_type: float = 2.0
The $p$ value of the $L_p$ norm used by max_norm. Default 2.0.
scale_grad_by_freq: bool = False
If True, scale gradients of each embedding row by the inverse of its frequency in the mini-batch. This is useful for highly skewed token distributions.
sparse: bool = False
Request a sparse gradient w.r.t. weight. Lucid currently always produces a dense gradient; this flag is accepted for API compatibility.
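
The max_norm renormalisation described above can be illustrated as follows. This is a hedged NumPy sketch: `renorm_rows` is a hypothetical helper, not a Lucid function, and it returns a scaled copy rather than modifying the table in-place.

```python
import numpy as np

def renorm_rows(weight, max_norm, norm_type=2.0):
    """Return a copy of `weight` whose rows have L_p norm at most `max_norm`.

    Hypothetical helper for illustration only; p = norm_type.
    """
    norms = np.linalg.norm(weight, ord=norm_type, axis=1, keepdims=True)
    # Rows within the limit keep scale 1; longer rows are shrunk to max_norm.
    # np.maximum guards the division for all-zero rows.
    scale = np.where(norms > max_norm, max_norm / np.maximum(norms, 1e-12), 1.0)
    return weight * scale

W = np.array([[3.0, 4.0],    # L2 norm 5.0, exceeds the limit
              [0.3, 0.4],    # L2 norm 0.5, left unchanged
              [0.0, 0.0]])   # zero row, left unchanged
W_clipped = renorm_rows(W, max_norm=1.0)
print(W_clipped)
```

Only the first row changes: it is rescaled to [0.6, 0.8], which has norm 1.0.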

Returns

Tensor

Embedded tensor of shape (*, embedding_dim).

Notes

The backward pass for embedding accumulates gradient contributions from repeated indices via scatter-add: multiple tokens of the same type in a batch correctly sum into the same row of $\partial L / \partial W$.
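
A minimal sketch of that scatter-add, again in NumPy rather than Lucid: `embedding_backward` is a hypothetical function written for this example, and its handling of padding_idx and scale_grad_by_freq follows the parameter descriptions above, not Lucid's source.

```python
import numpy as np

def embedding_backward(grad_out, x, num_embeddings,
                       padding_idx=None, scale_grad_by_freq=False):
    """Gradient w.r.t. the table for out = W[x] (illustrative sketch)."""
    embedding_dim = grad_out.shape[-1]
    flat_idx = x.reshape(-1)                        # (N,)
    flat_grad = grad_out.reshape(-1, embedding_dim) # (N, embedding_dim)

    if scale_grad_by_freq:
        # Divide each contribution by how often its index occurs in the batch.
        counts = np.bincount(flat_idx, minlength=num_embeddings)
        flat_grad = flat_grad / counts[flat_idx][:, None]

    grad_w = np.zeros((num_embeddings, embedding_dim))
    # Unbuffered scatter-add: repeated indices accumulate instead of overwriting.
    np.add.at(grad_w, flat_idx, flat_grad)

    if padding_idx is not None:
        grad_w[padding_idx] = 0.0                   # padding row gets no gradient
    return grad_w

x = np.array([[1, 2, 4], [4, 3, 2]])
grad_out = np.ones((2, 3, 4))
grad_w = embedding_backward(grad_out, x, num_embeddings=10)
print(grad_w[:, 0])   # indices 2 and 4 occur twice, so their rows hold 2.0
```

Plain fancy-index assignment (`grad_w[flat_idx] += flat_grad`) would count each repeated index only once, which is why the sketch uses `np.add.at`.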

Examples

>>> import lucid
>>> from lucid.nn.functional import embedding
>>> table = lucid.randn(10, 4)              # 10 tokens, dim 4
>>> ids = lucid.tensor([[1, 2, 4], [4, 3, 2]], dtype=lucid.int64)
>>> out = embedding(ids, table)
>>> out.shape
(2, 3, 4)