class

NLLLoss

extends Module
NLLLoss(weight: Tensor | None = None, ignore_index: int = -100, reduction: str = 'mean')

Negative log-likelihood loss.

Operates on log-probabilities: the input is expected to already be the output of a LogSoftmax layer. For sample n with true class label y_n, the per-sample loss is:

$\ell_n = -x_{n,\,y_n}$

and the final scalar is obtained via the chosen reduction.
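The per-sample formula and the three reduction modes can be sketched in plain Python. This is only an illustration, not the library's implementation: the helper names `nll_per_sample` and `reduce_losses` are hypothetical, nested lists stand in for tensors, and 'mean' is assumed to be an unweighted average over the N samples.

```python
def nll_per_sample(log_probs, targets):
    """Per-sample loss l_n = -x[n][y_n]; log_probs is a list of rows."""
    return [-row[y] for row, y in zip(log_probs, targets)]


def reduce_losses(losses, reduction="mean"):
    """Collapse per-sample losses according to the reduction mode."""
    if reduction == "none":
        return list(losses)
    if reduction == "sum":
        return sum(losses)
    if reduction == "mean":
        # Assumption: unweighted average over all N samples.
        return sum(losses) / len(losses)
    raise ValueError(f"unknown reduction: {reduction!r}")


log_probs = [[-1.2, -0.5, -2.1], [-0.8, -1.5, -0.3]]
targets = [1, 2]

per_sample = nll_per_sample(log_probs, targets)  # [0.5, 0.3]
total = reduce_losses(per_sample, "sum")         # about 0.8
mean = reduce_losses(per_sample, "mean")         # about 0.4
```

Each per-sample value is just the negated log-probability that the input assigns to the true class, so a confident correct prediction (log-probability near 0) yields a loss near 0.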

This is the second stage of the two-step formulation of multi-class cross-entropy: LogSoftmax followed by NLLLoss. When a single fused module is preferred, use CrossEntropyLoss instead.
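To illustrate the two-step formulation, the sketch below compares LogSoftmax followed by NLL against cross-entropy computed directly from the logits. It uses only the standard library; `log_softmax`, `nll_mean`, and `cross_entropy_mean` are hypothetical helpers written for this comparison, not the library's functions, and a plain mean over samples is assumed.

```python
import math


def log_softmax(row):
    """Numerically stable log-softmax of one row of logits."""
    m = max(row)
    log_norm = m + math.log(sum(math.exp(v - m) for v in row))
    return [v - log_norm for v in row]


def nll_mean(log_probs, targets):
    """Second stage: pick -log p of the true class, then average."""
    picked = [-row[y] for row, y in zip(log_probs, targets)]
    return sum(picked) / len(picked)


def cross_entropy_mean(logits, targets):
    """Fused form: -log(softmax(logits)[y]) computed in one pass."""
    losses = []
    for row, y in zip(logits, targets):
        prob_y = math.exp(row[y]) / sum(math.exp(v) for v in row)
        losses.append(-math.log(prob_y))
    return sum(losses) / len(losses)


logits = [[0.5, 1.5, 0.3], [1.2, 0.1, 0.9]]
targets = [1, 0]

two_step = nll_mean([log_softmax(row) for row in logits], targets)
fused = cross_entropy_mean(logits, targets)
# two_step and fused agree up to floating-point rounding.
```

The two-step route is convenient when the log-probabilities are needed elsewhere (for example, for sampling or logging), while the fused form avoids materialising them.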

Parameters

weight: Tensor of shape (C,) = None
Manual rescaling weight for each class. Useful for imbalanced class distributions.
ignore_index: int = -100
Target value to ignore; samples with this target do not contribute to the gradient.
reduction: str = 'mean'
Reduction applied to the per-sample losses: 'none' | 'mean' (default) | 'sum'.
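How `weight` and `ignore_index` interact with the reduction can be sketched as follows. This is a hedged illustration in plain Python, not the library's code: `weighted_nll` is a hypothetical helper, and the 'mean' normalisation shown (dividing by the summed weights of the kept targets) follows the PyTorch convention, which is an assumption here and may differ from this library's behaviour.

```python
def weighted_nll(log_probs, targets, weight=None,
                 ignore_index=-100, reduction="mean"):
    """NLL with optional per-class weights and an ignored target value."""
    losses, used_weights = [], []
    for row, y in zip(log_probs, targets):
        if y == ignore_index:
            # Ignored samples contribute nothing to the loss.
            losses.append(0.0)
            used_weights.append(0.0)
            continue
        w = 1.0 if weight is None else weight[y]
        losses.append(-w * row[y])
        used_weights.append(w)
    if reduction == "none":
        return losses
    if reduction == "sum":
        return sum(losses)
    if reduction == "mean":
        # Assumption: weighted mean over non-ignored samples.
        # Raises ZeroDivisionError if every target is ignored.
        return sum(losses) / sum(used_weights)
    raise ValueError(f"unknown reduction: {reduction!r}")


log_probs = [[-1.2, -0.5, -2.1], [-0.8, -1.5, -0.3], [-0.4, -1.9, -1.6]]
targets = [1, 2, -100]            # the last sample is ignored
class_weight = [1.0, 3.0, 1.0]    # class 1 counts three times as much

per_sample = weighted_nll(log_probs, targets, class_weight, reduction="none")
# per_sample is [1.5, 0.3, 0.0]
mean_loss = weighted_nll(log_probs, targets, class_weight)  # about 0.45
```

Up-weighting a rare class makes its mistakes cost more, which is why the parameter is useful for imbalanced data.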

Attributes

weight: Tensor or None
Per-class weight tensor.
ignore_index: int
Excluded target index.
reduction: str
The reduction mode.

Notes

  • Input x : (N, C) or (N, C, d_1, ...), log-probabilities (e.g. output of LogSoftmax).
  • Target y : (N,) or (N, d_1, ...), integer class indices.
  • Output : scalar for 'mean' / 'sum'; (N,) for 'none'.
  • The input must contain log-probabilities. Feeding raw logits or softmax probabilities produces incorrect results.
  • Typical usage: log_p = lucid.nn.functional.log_softmax(logits, dim=1), then loss = NLLLoss()(log_p, targets).
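The warning about input type can be made concrete with a small standard-library sketch. The helpers `log_softmax`, `softmax`, `nll_single`, and `is_log_prob_row` are hypothetical and exist only for this illustration; the last one is a possible sanity check, not part of the library.

```python
import math


def log_softmax(row):
    """Numerically stable log-softmax of one row of logits."""
    m = max(row)
    log_norm = m + math.log(sum(math.exp(v - m) for v in row))
    return [v - log_norm for v in row]


def softmax(row):
    """Probabilities obtained by exponentiating the log-softmax."""
    return [math.exp(v) for v in log_softmax(row)]


def nll_single(row, y):
    """Per-sample NLL: negate the entry at the true class."""
    return -row[y]


def is_log_prob_row(row, tol=1e-6):
    """True if exp(row) sums to 1, i.e. the row holds log-probabilities."""
    return abs(sum(math.exp(v) for v in row) - 1.0) < tol


logits = [0.5, 1.5, 0.3]
target = 1

correct = nll_single(log_softmax(logits), target)   # positive, as expected
from_logits = nll_single(logits, target)            # -1.5, a negative "loss"
from_probs = nll_single(softmax(logits), target)    # between -1 and 0
```

A true negative log-likelihood is never negative, so a negative loss value during training is a strong hint that raw logits or probabilities were passed instead of log-probabilities.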

Examples

Manual log-softmax then NLLLoss:
>>> import lucid
>>> import lucid.nn as nn
>>> log_softmax = nn.LogSoftmax(dim=1)
>>> criterion = nn.NLLLoss()
>>> # Raw scores for 2 samples over 3 classes
>>> logits = lucid.tensor([[0.5, 1.5, 0.3], [1.2, 0.1, 0.9]])
>>> log_probs = log_softmax(logits)  # convert logits to log-probabilities
>>> targets = lucid.tensor([1, 0])   # true class index per sample
>>> loss = criterion(log_probs, targets)
With a custom per-class weight:
>>> import lucid
>>> import lucid.nn as nn
>>> criterion = nn.NLLLoss(weight=lucid.tensor([1.0, 3.0, 1.0]))
>>> log_probs = lucid.tensor([[-1.2, -0.5, -2.1], [-0.8, -1.5, -0.3]])
>>> targets   = lucid.tensor([1, 2])
>>> loss = criterion(log_probs, targets)

Methods (3)

dunder

__init__

None
__init__(weight: Tensor | None = None, ignore_index: int = -100, reduction: str = 'mean')

Initialise the NLLLoss module. See the class docstring for parameter semantics.

fn

forward

Tensor
forward(x: Tensor, target: Tensor)

Compute the negative log-likelihood loss between the log-probabilities x and the class indices target.

Parameters

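x: Tensor
Log-probabilities of shape (N, C) or (N, C, d_1, ...), e.g. the output of LogSoftmax.
target: Tensor
Integer class indices of shape (N,) or (N, d_1, ...).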

Returns

Tensor

Scalar loss for 'mean' or 'sum'; unreduced tensor of shape (N,) for 'none'.

fn

extra_repr

str
extra_repr()

Return a string representation of the layer's configuration.