log_softmax

log_softmax(x: Tensor, dim: int | None = None) → Tensor



Numerically stable log-softmax along a dimension.

Mathematically equivalent to log(softmax(x, dim)), but computed in a way that avoids overflow in the exponentials and the precision loss from taking the logarithm of very small softmax probabilities. It is almost always the right input for negative log-likelihood (NLL) classification losses.

Parameters

x : Tensor

Input logits of any shape.

dim : int | None = None

Dimension along which the log-softmax is computed. If None, the last dimension (-1) is used.

Returns

Tensor

Log-probabilities with the same shape as x. Their exponentials sum to one along dim.

Notes

$$\text{log\_softmax}(x)_i = x_i - \log\!\sum_j e^{x_j} = (x_i - m) - \log\!\sum_j e^{x_j - m}, \quad m = \max_j x_j$$

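The $-m$ shift inside the exponent is the key numerical trick: it makes every exponent non-positive, so $e^{x_j - m} \in (0, 1]$ and no term can overflow. The largest term is exactly $e^0 = 1$, so the sum lies in $[1, n]$ for $n$ entries and its logarithm is always finite. Pairing log_softmax with nll_loss is mathematically equivalent to, and more stable than, computing log(softmax(x)) explicitly before the loss.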
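The shift described above can be sketched in plain NumPy to show why it matters. This is an illustrative reimplementation, not Lucid's kernel: the helper names `log_softmax_np` and `naive_log_softmax` are hypothetical, and NumPy is assumed only for demonstration.

```python
import numpy as np

def log_softmax_np(x, axis=-1):
    """Max-shifted log-softmax (illustrative sketch, not Lucid's kernel)."""
    x = np.asarray(x, dtype=np.float64)
    m = np.max(x, axis=axis, keepdims=True)   # m = max_j x_j
    shifted = x - m                           # every entry is <= 0
    # exp(shifted) lies in (0, 1], so the sum cannot overflow
    log_sum = np.log(np.sum(np.exp(shifted), axis=axis, keepdims=True))
    return shifted - log_sum

def naive_log_softmax(x, axis=-1):
    """Unshifted formula x_i - log(sum_j exp(x_j)); overflows for large logits."""
    x = np.asarray(x, dtype=np.float64)
    with np.errstate(over="ignore", invalid="ignore"):
        return x - np.log(np.sum(np.exp(x), axis=axis, keepdims=True))

small = log_softmax_np([[1.0, 2.0, 3.0]])
large = log_softmax_np([[1000.0, 1001.0, 1002.0]])   # same result: shift-invariant
naive = naive_log_softmax([[1000.0, 1001.0, 1002.0]])  # exp(1000) overflows to inf
```

Adding a constant to every logit leaves log-softmax unchanged, so `small` and `large` agree, while the unshifted version produces non-finite values for the large logits.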

Examples

>>> import lucid
>>> from lucid.nn.functional import log_softmax
>>> logits = lucid.tensor([[1.0, 2.0, 3.0]])
>>> log_softmax(logits, dim=1)
Tensor([[-2.4076, -1.4076, -0.4076]])
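To illustrate how these log-probabilities feed an NLL loss, the following sketch computes the mean negative log-likelihood by hand with NumPy indexing. It deliberately avoids calling Lucid's nll_loss, whose signature is not shown here; the target class index `2` is an assumed label chosen for illustration.

```python
import numpy as np

# Log-probabilities from the example above, shape (batch=1, classes=3)
log_probs = np.array([[-2.4076, -1.4076, -0.4076]])

# Assumed ground-truth class index for the single sample
targets = np.array([2])

# NLL picks the log-probability of the target class and negates it,
# then averages over the batch
nll = -log_probs[np.arange(len(targets)), targets].mean()
```

Here `nll` is the negated log-probability of class 2, roughly 0.4076.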
