nll_loss

→Tensor

nll_loss(x: Tensor, target: Tensor, weight: Tensor | None = None, ignore_index: int = -100, reduction: Reduction = 'mean')


Implementing kernel

C++ nll_loss_op (free fn)

Negative log-likelihood loss for multi-class classification.

The "back-half" of cross_entropy: assumes the input is already a tensor of log-probabilities (typically produced by lucid.nn.functional.log_softmax). Provided as a separate entry-point so models that need log-probabilities for downstream use (e.g., beam search) can avoid recomputing them.
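The split described above can be illustrated with a minimal pure-Python sketch. It is not lucid's implementation: `log_softmax_row` and `nll_row` are hypothetical helpers written for this example, operating on a single row of plain floats. The point is only that the log-probabilities are computed once and then reused, both for a downstream decision and for the loss.

```python
import math

def log_softmax_row(logits):
    """Numerically stable log-softmax of one row of logits (hypothetical helper)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]

def nll_row(log_probs, target):
    """Negative log-likelihood of the target class for one sample (hypothetical helper)."""
    return -log_probs[target]

logits = [2.0, 0.5, 0.1]
log_probs = log_softmax_row(logits)  # computed once

# Reuse the same log-probabilities for a downstream decision ...
best_class = max(range(len(log_probs)), key=log_probs.__getitem__)

# ... and for the loss, without a second softmax pass.
loss = nll_row(log_probs, target=0)
```

Composing the two helpers on raw logits gives the same value a fused cross-entropy would produce for this row.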

Parameters

x: Tensor

Log-probabilities of shape $(N, C)$ or $(N, C, d_1, \dots, d_k)$.

target: Tensor

Integer class indices of shape $(N,)$ or $(N, d_1, \dots, d_k)$.

weight: Tensor or None = None

Per-class weight vector of shape $(C,)$.

ignore_index: int = -100

Target value to ignore. Samples whose target equals this value are excluded from the loss (default -100).

reduction: str = 'mean'

"mean" (default), "sum", or "none".

Returns

Tensor

Scalar for "mean" and "sum"; per-sample tensor for "none".
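For the higher-dimensional shapes listed under Parameters, the following pure-Python sketch shows how an input of shape $(N, C, d_1)$ pairs with a target of shape $(N, d_1)$. It assumes the class axis is axis 1 and that the unreduced result has the same shape as target; `nll_none_3d` is a hypothetical helper, not part of lucid, and the values are illustrative rather than normalized log-probabilities.

```python
def nll_none_3d(log_probs, target):
    """Per-element NLL for log_probs of shape (N, C, D) and target of shape (N, D)."""
    return [
        [-log_probs[n][t][d] for d, t in enumerate(target[n])]
        for n in range(len(target))
    ]

# N=1 sample, C=2 classes, D=3 positions; the class axis is axis 1.
log_probs = [[[-0.2, -1.0, -0.5],    # class 0 at positions 0..2
              [-1.7, -0.4, -0.9]]]   # class 1 at positions 0..2
target = [[0, 1, 1]]                 # shape (N, D)

per_element = nll_none_3d(log_probs, target)  # shape (N, D)
```

Each output entry is the negated log-probability of the target class at that position, so the result here has one row of three values.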

Notes

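Per-sample loss, where $t_i$ is the target class of sample $i$ and $w_c = 1$ for every class when weight is None:

$$L_i = -w_{t_i}\,x_{i, t_i}$$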

Under "mean" reduction, the divisor is the sum of effective sample weights, $\sum_i w_{t_i}\,\mathbb{1}[t_i \ne \text{ignore}]$, not the raw sample count. This matches the standard convention and makes the loss invariant to a global rescaling of weight.
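The weighted mean can be checked with a small reference implementation. This is a sketch in plain Python over nested lists, not the C++ kernel: `nll_loss_ref` is a hypothetical name, and two behaviours are assumptions of the sketch rather than documented facts: an ignored sample contributes zero loss under "none", and an all-ignored batch yields NaN under "mean".

```python
import math

def nll_loss_ref(log_probs, target, weight=None, ignore_index=-100, reduction="mean"):
    """Reference NLL loss for (N, C) log-probabilities given as nested lists."""
    num_classes = len(log_probs[0])
    if weight is None:
        weight = [1.0] * num_classes  # unweighted: every class counts as 1

    losses, eff_weights = [], []
    for row, t in zip(log_probs, target):
        if t == ignore_index:
            # Assumption: an ignored sample contributes 0 loss and 0 weight.
            losses.append(0.0)
            eff_weights.append(0.0)
        else:
            losses.append(-weight[t] * row[t])
            eff_weights.append(weight[t])

    if reduction == "none":
        return losses
    if reduction == "sum":
        return sum(losses)
    if reduction == "mean":
        total = sum(eff_weights)
        # Assumption: an all-ignored batch yields NaN rather than raising.
        return sum(losses) / total if total > 0 else math.nan
    raise ValueError(f"unknown reduction: {reduction!r}")

log_probs = [[-0.1, -2.5, -3.0], [-1.2, -0.4, -2.9], [-0.7, -1.1, -1.6]]
target = [0, 1, -100]            # the third sample is ignored
weight = [1.0, 3.0, 2.0]

mean_loss = nll_loss_ref(log_probs, target, weight)
# The divisor is 1.0 + 3.0 = 4.0 (weights of the kept samples), not the count 2.
scaled = nll_loss_ref(log_probs, target, [10 * w for w in weight])
```

With these inputs both `mean_loss` and `scaled` come out at 0.325 up to floating-point rounding, whereas dividing by the raw count of kept samples would give 0.65 and would change when the weights are rescaled.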

Examples

>>> import lucid
>>> from lucid.nn.functional import nll_loss, log_softmax
>>> logits = lucid.tensor([[2.0, 0.5, 0.1], [0.0, 1.5, 0.2]])
>>> target = lucid.tensor([0, 1])
>>> nll_loss(log_softmax(logits, dim=1), target)
Tensor(0.3597...)
