class NLLLoss (extends Module)

NLLLoss(weight: Tensor | None = None, ignore_index: int = -100, reduction: Reduction = 'mean')


Implementing kernel: C++ class NLLLossBackward

Negative log-likelihood loss.

Operates on log-probabilities — the input is expected to already be the output of a LogSoftmax layer. For sample n with true class label y_n the per-sample loss is:

$$\ell_n = -x_{n,\,y_n}$$

and the final scalar is obtained via the chosen reduction.
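The per-sample formula and the three reductions can be sketched in plain Python. This is a reference sketch only, not lucid's implementation: the helper name `nll_loss` and the use of nested lists in place of tensors are assumptions made for illustration.

```python
# Pure-Python reference sketch; not lucid's implementation.
def nll_loss(log_probs, targets, reduction="mean"):
    """log_probs: N rows of C log-probabilities; targets: N class indices."""
    # Per-sample loss: l_n = -x[n][y_n]
    per_sample = [-row[y] for row, y in zip(log_probs, targets)]
    if reduction == "none":
        return per_sample
    if reduction == "sum":
        return sum(per_sample)
    if reduction == "mean":
        return sum(per_sample) / len(per_sample)
    raise ValueError(f"unknown reduction: {reduction!r}")

log_probs = [[-1.2, -0.5, -2.1], [-0.8, -1.5, -0.3]]
targets = [1, 2]

losses = nll_loss(log_probs, targets, reduction="none")  # [0.5, 0.3]
total = nll_loss(log_probs, targets, reduction="sum")
mean = nll_loss(log_probs, targets, reduction="mean")
```

Each per-sample loss picks out the log-probability at the target index and negates it, so a confident correct prediction (log-probability near 0) gives a loss near 0.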

This is the second stage of the two-step formulation of multi-class cross-entropy: LogSoftmax followed by NLLLoss. When a single fused module is preferred, use CrossEntropyLoss instead.
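The equivalence between the two-step and fused formulations can be checked numerically. The sketch below uses plain Python rather than lucid; `log_softmax`, `nll_mean`, and `cross_entropy` are hypothetical helpers written for this illustration, with the default 'mean' reduction and no class weights assumed.

```python
import math

def log_softmax(row):
    # Subtract the max for numerical stability before exponentiating.
    m = max(row)
    log_z = m + math.log(sum(math.exp(v - m) for v in row))
    return [v - log_z for v in row]

def nll_mean(log_probs, targets):
    # Second stage: mean of -x[n][y_n].
    return sum(-row[y] for row, y in zip(log_probs, targets)) / len(targets)

def cross_entropy(logits, targets):
    # Fused form: mean over n of (log sum_c exp(z[n][c]) - z[n][y_n]).
    total = 0.0
    for row, y in zip(logits, targets):
        m = max(row)
        log_z = m + math.log(sum(math.exp(v - m) for v in row))
        total += log_z - row[y]
    return total / len(targets)

logits = [[0.5, 1.5, 0.3], [1.2, 0.1, 0.9]]
targets = [1, 0]

two_step = nll_mean([log_softmax(r) for r in logits], targets)
fused = cross_entropy(logits, targets)
# two_step and fused agree up to floating-point rounding.
```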

Parameters

weight : Tensor of shape (C,) = None

Manual rescaling weight for each class. Useful for imbalanced class distributions.

ignore_index : int = -100

Target value that is ignored and does not contribute to the gradient.

reduction : str = 'mean'

How the per-sample losses are combined: 'none' | 'mean' (default) | 'sum'.
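The effect of weight and ignore_index can be sketched as follows. This is a plain-Python illustration under stated assumptions, not lucid's code: `weighted_nll` is a hypothetical helper, an ignored sample is assumed to contribute a loss of 0, and the weighted mean follows the PyTorch convention of dividing by the sum of the weights of the non-ignored targets. The normalisation lucid applies for 'mean' is not stated on this page.

```python
def weighted_nll(log_probs, targets, weight=None, ignore_index=-100):
    """Return (per-sample losses, per-sample weights)."""
    losses, weights = [], []
    for row, y in zip(log_probs, targets):
        if y == ignore_index:
            # Assumption: ignored targets contribute nothing to the loss.
            losses.append(0.0)
            weights.append(0.0)
            continue
        w = 1.0 if weight is None else weight[y]
        losses.append(-w * row[y])  # l_n = -w[y_n] * x[n][y_n]
        weights.append(w)
    return losses, weights

log_probs = [[-1.2, -0.5, -2.1], [-0.8, -1.5, -0.3], [-0.1, -3.0, -2.5]]
targets = [1, 2, -100]  # the third sample is ignored

losses, weights = weighted_nll(log_probs, targets, weight=[1.0, 3.0, 1.0])
loss_sum = sum(losses)
# Assumption: PyTorch-style weighted mean (divide by the total weight).
loss_mean = loss_sum / sum(weights)
```

Class 1 carries weight 3.0, so the first sample contributes three times its unweighted loss, while the ignored sample contributes nothing to either the sum or the normaliser.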

Attributes

weight : Tensor or None

Per-class weight tensor.

ignore_index : int

Excluded target index.

reduction : str

The reduction mode.

Notes

Input x : $(N, C)$ or $(N, C, d_1, \ldots)$ — log-probabilities (e.g. output of LogSoftmax).
Target y : $(N,)$ or $(N, d_1, \ldots)$ — integer class indices.
Output : scalar for 'mean' / 'sum'; $(N,)$ for 'none'.

The input must contain log-probabilities. Feeding raw logits or softmax probabilities produces incorrect results.
Typical usage: log_p = lucid.nn.functional.log_softmax(logits, dim=1), then loss = NLLLoss()(log_p, targets).
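A quick sanity check for the input requirement: log-probabilities are never positive, so a correctly computed loss is never negative. The plain-Python sketch below (hypothetical helpers, no lucid calls, 'mean' reduction without weights assumed) shows what happens when raw logits or softmax probabilities are passed instead.

```python
import math

def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    z = sum(exps)
    return [e / z for e in exps]

def log_softmax(row):
    m = max(row)
    log_z = m + math.log(sum(math.exp(v - m) for v in row))
    return [v - log_z for v in row]

def nll_mean(x, targets):
    return sum(-row[y] for row, y in zip(x, targets)) / len(targets)

logits = [[4.0, 1.0, 0.5], [0.2, 3.5, 0.1]]
targets = [0, 1]

correct = nll_mean([log_softmax(r) for r in logits], targets)  # non-negative
from_logits = nll_mean(logits, targets)  # just the negated mean target logit
from_probs = nll_mean([softmax(r) for r in logits], targets)  # lies in [-1, 0]
```

A negative value, or one that never leaves the interval [-1, 0], is a strong hint that the input was not passed through a log-softmax first.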

Examples

Manual log-softmax then NLLLoss:
>>> import lucid
>>> import lucid.nn as nn
>>> log_softmax = nn.LogSoftmax(dim=1)
>>> criterion = nn.NLLLoss()
>>> logits = lucid.tensor([[0.5, 1.5, 0.3], [1.2, 0.1, 0.9]])
>>> log_probs = log_softmax(logits)
>>> targets = lucid.tensor([1, 0])
>>> loss = criterion(log_probs, targets)

With a custom per-class weight:
>>> import lucid
>>> import lucid.nn as nn
>>> criterion = nn.NLLLoss(weight=lucid.tensor([1.0, 3.0, 1.0]))
>>> log_probs = lucid.tensor([[-1.2, -0.5, -2.1], [-0.8, -1.5, -0.3]])
>>> targets = lucid.tensor([1, 2])
>>> loss = criterion(log_probs, targets)

Used by 1

lucid.nn.modules

Constructors

__init__

→None

__init__(weight: Tensor | None = None, ignore_index: int = -100, reduction: Reduction = 'mean')


Initialise the NLLLoss module. See the class docstring for parameter semantics.

Instance methods

extra_repr

→str

extra_repr()


Return a string representation of the layer's configuration.

forward

→Tensor

forward(x: Tensor, target: Tensor)


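Compute the negative log-likelihood loss between the log-probabilities x and the class-index targets.

Parameters

x : Tensor

Log-probabilities of shape $(N, C)$ or $(N, C, d_1, \ldots)$, e.g. the output of LogSoftmax.

target : Tensor

Integer class indices of shape $(N,)$ or $(N, d_1, \ldots)$.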

Returns

Tensor

Scalar loss for 'mean' or 'sum'; unreduced per-sample tensor for 'none'.
