nll_loss

nll_loss(x: Tensor, target: Tensor, weight: Tensor | None = None, ignore_index: int = -100, reduction: str = 'mean') -> Tensor

Negative log-likelihood loss for multi-class classification.

The "back-half" of cross_entropy: assumes the input is already a tensor of log-probabilities (typically produced by lucid.nn.functional.log_softmax). Provided as a separate entry-point so models that need log-probabilities for downstream use (e.g., beam search) can avoid recomputing them.
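The split can be sketched in plain Python, without lucid, to show what each half does. The helpers `log_softmax_row` and `nll_mean` are hypothetical stand-ins written for illustration, not lucid's implementation. The point is that the log-probabilities are computed once and then shared by the loss and by a downstream consumer (here, a simple top-2 ranking standing in for a search step).

```python
import math

def log_softmax_row(logits):
    """Numerically stable log-softmax of one row of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]

def nll_mean(log_probs, targets):
    """Unweighted mean NLL: pick out -log p[target] for each row."""
    picked = [-row[t] for row, t in zip(log_probs, targets)]
    return sum(picked) / len(picked)

logits = [[1.0, 1.0, 1.0], [0.0, 3.0, 0.0]]
targets = [2, 1]

# First half: compute the log-probabilities once.
log_probs = [log_softmax_row(row) for row in logits]

# Second half: the NLL step. Composing the two halves is what
# cross-entropy on raw logits computes.
loss = nll_mean(log_probs, targets)

# Reuse the same log-probabilities downstream, e.g. to rank classes.
top2 = [sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:2]
        for row in log_probs]
```

Calling a fused cross-entropy on `logits` would give the same loss but would not expose `log_probs` for the ranking step.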

Parameters

x : Tensor
Log-probabilities of shape $(N, C)$ or $(N, C, d_1, \dots, d_k)$.
target : Tensor
Integer class indices of shape $(N,)$ or $(N, d_1, \dots, d_k)$.
weight : Tensor or None, default None
Per-class weight vector of shape $(C,)$.
ignore_index : int, default -100
Target value marking samples that are excluded from the loss and from the "mean" divisor.
reduction : str, default 'mean'
One of "mean", "sum", or "none".

Returns

Tensor

Scalar for "mean" and "sum" reduction; per-sample tensor for "none".

Notes

Per-sample loss:

$$L_i = -w_{t_i}\,x_{i, t_i}$$

Under "mean" reduction, the divisor is the sum of effective sample weights, $\sum_i w_{t_i}\,\mathbb{1}[t_i \ne \text{ignore}]$, not the raw sample count. This matches the standard convention and makes the loss invariant to a global rescaling of weight.
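The per-sample formula and the weighted divisor can be checked with a small pure-Python reference for the 2-D case. This is a sketch under stated assumptions, not lucid's code: `nll_loss_ref` is a hypothetical name, and returning 0.0 for ignored samples under "none" is an assumption borrowed from common practice rather than something this page specifies.

```python
import math

def nll_loss_ref(log_probs, targets, weight=None, ignore_index=-100,
                 reduction="mean"):
    """Reference NLL over rows of log-probabilities (2-D case only)."""
    num_classes = len(log_probs[0])
    w = weight if weight is not None else [1.0] * num_classes
    losses, kept_weight = [], 0.0
    for row, t in zip(log_probs, targets):
        if t == ignore_index:
            losses.append(0.0)          # assumption: ignored rows contribute 0
            continue
        losses.append(-w[t] * row[t])   # L_i = -w_{t_i} * x_{i, t_i}
        kept_weight += w[t]             # accumulate the effective weight
    if reduction == "none":
        return losses
    if reduction == "sum":
        return sum(losses)
    if reduction == "mean":
        # Divisor is the summed weight of kept targets, not the row count.
        # Raises ZeroDivisionError if every target is ignored.
        return sum(losses) / kept_weight
    raise ValueError(f"unknown reduction: {reduction!r}")

probs = [[0.5, 0.25, 0.25], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6]]
log_probs = [[math.log(p) for p in row] for row in probs]
targets = [0, 1, -100]                  # the third sample is ignored
weight = [1.0, 3.0, 2.0]

per_sample = nll_loss_ref(log_probs, targets, weight, reduction="none")
sum_loss = nll_loss_ref(log_probs, targets, weight, reduction="sum")
mean_loss = nll_loss_ref(log_probs, targets, weight)

# Rescale every class weight by the same factor.
scaled = [10.0 * w for w in weight]
sum_scaled = nll_loss_ref(log_probs, targets, scaled, reduction="sum")
mean_scaled = nll_loss_ref(log_probs, targets, scaled)
```

Here the divisor is 1.0 + 3.0 = 4.0, the weights of the two kept targets, rather than 2 or 3. Multiplying every weight by 10 scales the "sum" result by 10 but leaves the "mean" result unchanged.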

Examples

>>> import lucid
>>> from lucid.nn.functional import nll_loss, log_softmax
>>> logits = lucid.tensor([[2.0, 0.5, 0.1], [0.0, 1.5, 0.2]])
>>> target = lucid.tensor([0, 1])
>>> nll_loss(log_softmax(logits, dim=1), target)
Tensor(0.3597...)
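For the higher-dimensional input shape listed under Parameters, the class axis is dimension 1 and the target drops that axis. The sketch below illustrates the indexing for a single extra dimension, with x of shape (N, C, L) and target of shape (N, L), using nested Python lists. `nll_mean_seq` is a hypothetical helper for illustration only; it is unweighted and assumes the same `ignore_index` behaviour described above.

```python
import math

def nll_mean_seq(x, target, ignore_index=-100):
    """Unweighted mean NLL for x of shape (N, C, L), target of shape (N, L)."""
    total, count = 0.0, 0
    for n, target_row in enumerate(target):
        for pos, t in enumerate(target_row):
            if t == ignore_index:
                continue                 # skipped in both sum and divisor
            total += -x[n][t][pos]       # class index goes on axis 1
            count += 1
    return total / count

# N = 1 sample, C = 2 classes, L = 3 positions.
# x[0][c][pos] is the log-probability of class c at position pos.
x = [[[math.log(0.9), math.log(0.5), math.log(0.2)],
      [math.log(0.1), math.log(0.5), math.log(0.8)]]]
target = [[0, -100, 1]]                  # the middle position is ignored

seq_loss = nll_mean_seq(x, target)
```

Only positions 0 and 2 contribute, so the result is the average of -log(0.9) and -log(0.8).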