embedding

embedding(x: Tensor, weight: Tensor, padding_idx: int | None = None, max_norm: float | None = None, norm_type: float = 2.0, scale_grad_by_freq: bool = False, sparse: bool = False) → Tensor



Look up rows of an embedding table by integer indices.

A learned embedding table maps integer tokens / categorical features into dense vectors:

$$\mathrm{out}[i_1, \dots, i_k] = W[\, x[i_1, \dots, i_k]\, ]$$

where $W$, of shape (num_embeddings, embedding_dim), is the lookup table and $x$ holds integer indices in [0, num_embeddings). The result is equivalent to the one-hot matrix product $\mathrm{onehot}(x) W$, but it is computed with an indexed gather.
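The equivalence between the gather and the one-hot product can be checked numerically. The sketch below uses NumPy rather than Lucid, purely as an illustration of the formula; the names `W`, `x`, `out_gather` and `out_matmul` are chosen for this example and are not part of the Lucid API.

```python
import numpy as np

rng = np.random.default_rng(0)
num_embeddings, embedding_dim = 10, 4

# Lookup table W and an integer index tensor x of shape (2, 3).
W = rng.standard_normal((num_embeddings, embedding_dim))
x = np.array([[1, 2, 4], [4, 3, 2]])

# Indexed gather: picks row W[x[i, j]] for every position (i, j).
out_gather = W[x]                      # shape (2, 3, 4)

# One-hot formulation: build onehot(x) explicitly, then multiply by W.
onehot = np.eye(num_embeddings)[x]     # shape (2, 3, 10)
out_matmul = onehot @ W                # shape (2, 3, 4)

print(out_gather.shape)
print(np.allclose(out_gather, out_matmul))
```

The gather reads only the rows that are indexed, whereas the one-hot product builds a tensor whose last dimension is num_embeddings before multiplying.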

Parameters

x: Tensor

Integer index tensor of arbitrary shape (*).

weight: Tensor

Embedding table of shape (num_embeddings, embedding_dim).

padding_idx: int | None = None

If given, the embedding vector at weight[padding_idx] is treated as a padding slot: its gradient is forced to zero so the padding embedding stays at its initialised value (typically a zero vector) throughout training.

max_norm: float | None = None

If given, every row of weight whose $L_p$ norm exceeds max_norm is renormalised in-place to have norm max_norm before the lookup, with $p$ = norm_type.

norm_type: float = 2.0

The $p$ value of the $L_p$ norm used by max_norm. Default 2.0.

scale_grad_by_freq: bool = False

If True, scale the gradient of each embedding row by the inverse of the number of times that row's index occurs in the mini-batch. This is useful for highly skewed token distributions.

sparse: bool = False

Request a sparse gradient w.r.t. weight. Lucid currently always produces a dense gradient; this flag is accepted for API compatibility.
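The row-wise renormalisation described for max_norm and norm_type can be sketched as follows. This is a NumPy illustration of the idea, not Lucid's kernel: `renorm_rows` is a hypothetical helper, it returns a copy instead of modifying the table in-place, and the small epsilon guarding the division is an assumption of this sketch.

```python
import numpy as np

def renorm_rows(weight, max_norm, norm_type=2.0):
    """Hypothetical helper: cap the L_p norm of every row at max_norm."""
    # One norm per row, kept as a column so it broadcasts over the row.
    norms = np.linalg.norm(weight, ord=norm_type, axis=1, keepdims=True)
    # Rows within the limit keep scale 1; longer rows shrink to max_norm.
    scale = np.where(norms > max_norm,
                     max_norm / np.maximum(norms, 1e-12),
                     1.0)
    return weight * scale

W = np.array([[3.0, 4.0],   # L2 norm 5.0, exceeds the limit
              [0.3, 0.4]])  # L2 norm 0.5, left untouched
W_clipped = renorm_rows(W, max_norm=1.0)
row_norms = np.linalg.norm(W_clipped, axis=1)
print(W_clipped)
print(row_norms)
```

Only rows whose norm exceeds the limit are rescaled; their direction is preserved and rows already within the limit are left unchanged.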

Returns

Tensor

Embedded tensor of shape (*, embedding_dim).

Notes

The backward pass for embedding accumulates gradient contributions from repeated indices via scatter-add, so multiple occurrences of the same token in a batch sum into the same row of $\partial L / \partial W$.
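A minimal NumPy sketch of this scatter-add, including the effect of padding_idx and scale_grad_by_freq as described under Parameters, might look like the following. `embedding_backward` is a hypothetical function written for illustration; it is not Lucid's implementation, and the order in which the two options are applied is an assumption.

```python
import numpy as np

def embedding_backward(grad_out, x, num_embeddings,
                       padding_idx=None, scale_grad_by_freq=False):
    """Hypothetical sketch: gradient of the loss w.r.t. the table."""
    embedding_dim = grad_out.shape[-1]
    flat_idx = x.reshape(-1)                       # (N,)
    flat_grad = grad_out.reshape(-1, embedding_dim)  # (N, embedding_dim)

    grad_w = np.zeros((num_embeddings, embedding_dim), dtype=grad_out.dtype)
    # Unbuffered scatter-add: repeated indices accumulate instead of
    # overwriting one another (plain fancy-index += would drop repeats).
    np.add.at(grad_w, flat_idx, flat_grad)

    if scale_grad_by_freq:
        # Divide each row by how often its index occurs in the batch.
        counts = np.bincount(flat_idx, minlength=num_embeddings)
        grad_w /= np.maximum(counts, 1)[:, None]

    if padding_idx is not None:
        # The padding row receives no gradient.
        grad_w[padding_idx] = 0.0
    return grad_w

x = np.array([[1, 2, 4], [4, 3, 2]])   # indices 2 and 4 appear twice
grad_out = np.ones((2, 3, 4))

grad_w = embedding_backward(grad_out, x, num_embeddings=10)
scaled = embedding_backward(grad_out, x, 10, scale_grad_by_freq=True)
padded = embedding_backward(grad_out, x, 10, padding_idx=4)
print(grad_w[:5, 0])   # first column of rows 0..4
```

With an all-ones upstream gradient, rows 2 and 4 of `grad_w` receive twice the contribution of rows 1 and 3, and unused rows stay at zero.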

Examples

>>> import lucid
>>> from lucid.nn.functional import embedding
>>> table = lucid.randn(10, 4)              # 10 tokens, dim 4
>>> ids = lucid.tensor([[1, 2, 4], [4, 3, 2]], dtype=lucid.int64)
>>> out = embedding(ids, table)
>>> out.shape
(2, 3, 4)
