CrossEntropyLoss
Module
CrossEntropyLoss(weight: Tensor | None = None, ignore_index: int = -100, reduction: str = 'mean', label_smoothing: float = 0.0)
Cross-entropy loss for multi-class classification.
This criterion combines a log-softmax and a negative log-likelihood
step in a single numerically stable operation. For a batch of N
samples, each of class index y_n from C classes, and raw
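logit vector x_n, the per-sample loss is:

$$\ell_n = -\log \frac{\exp(x_{n, y_n})}{\sum_{c=0}^{C-1} \exp(x_{n, c})} = -x_{n, y_n} + \log \sum_{c=0}^{C-1} \exp(x_{n, c})$$
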
The log-sum-exp trick is applied internally so that large logit values do not cause overflow or underflow.
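As a rough illustration of the log-sum-exp trick mentioned above, the following pure-Python sketch computes the per-sample loss for a single logit vector. The helper names `log_softmax` and `cross_entropy` are hypothetical and are not part of the library; the sketch only assumes the standard max-shift formulation of log-sum-exp.

```python
import math

def log_softmax(logits):
    """Stable log-softmax of one logit vector (hypothetical helper)."""
    m = max(logits)  # shift by the max so the largest exponent is exp(0) = 1
    lse = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - lse for v in logits]

def cross_entropy(logits, target):
    """Per-sample loss: negative log-probability of the target class."""
    return -log_softmax(logits)[target]

# A naive math.exp(1000.0) would raise OverflowError;
# the shifted form only ever exponentiates values <= 0.
big = [1000.0, 999.0, 0.0]
print(cross_entropy(big, 0))
```

Subtracting the maximum does not change the result mathematically, since the shift cancels between the numerator and the denominator of the softmax.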
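Label smoothing: when label_smoothing = ε > 0,
the hard target is softened to a mixture of the one-hot label and the
uniform distribution:

$$q_{n,c} = (1 - \varepsilon)\,\mathbb{1}[c = y_n] + \frac{\varepsilon}{C}$$

which replaces the loss with:

$$\ell_n = -\sum_{c=0}^{C-1} q_{n,c} \log \frac{\exp(x_{n,c})}{\sum_{j=0}^{C-1} \exp(x_{n,j})}$$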
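The smoothing step can be sketched in plain Python as follows. This is an illustration only: the helpers `smoothed_target` and `smoothed_cross_entropy` are hypothetical, and the mixing weights (1 - eps on the one-hot label, eps / C on each class) are an assumption based on the common convention, not a statement of the library's internals.

```python
import math

def smoothed_target(target, num_classes, eps):
    """Mix the one-hot label with the uniform distribution (assumed weights)."""
    uniform = eps / num_classes
    return [(1.0 - eps) * (1.0 if c == target else 0.0) + uniform
            for c in range(num_classes)]

def smoothed_cross_entropy(logits, target, eps):
    """Cross-entropy between the softened target and the softmax of the logits."""
    m = max(logits)  # log-sum-exp shift for numerical stability
    lse = m + math.log(sum(math.exp(v - m) for v in logits))
    log_probs = [v - lse for v in logits]
    q = smoothed_target(target, len(logits), eps)
    return -sum(qc * lp for qc, lp in zip(q, log_probs))

q = smoothed_target(1, 3, 0.1)
print(q)  # still sums to 1; most of the mass stays on class 1
print(smoothed_cross_entropy([1.0, 2.0, 0.5], 1, 0.1))
```

With eps set to 0.0 the softened target reduces to the one-hot label, so the function falls back to the ordinary cross-entropy.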
Parameters
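- weight (Tensor of shape (C,), default None): per-class weights, one entry for each of the C classes.
- ignore_index (int, default -100)
- reduction (str, default 'mean'): 'none' | 'mean' (default) | 'sum'.
- label_smoothing (float, default 0.0): 0.0 means no smoothing.
Attributes
- weight (Tensor or None): None if not provided.
- ignore_index (int)
- reduction (str)
- label_smoothing (float)
Notes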
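- Input x: raw unnormalised logits, e.g. of shape (N, C) for a batch of N samples over C classes.
- Target y: integer class indices in [0, C-1], e.g. of shape (N,).
- Output: scalar when reduction is 'mean' or 'sum'; an unreduced tensor (e.g. of shape (N,)) for 'none'.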
- Pass raw logits rather than softmax probabilities: the criterion applies log-softmax internally, and its log-sum-exp implementation avoids the overflow and underflow that exponentiating large logits would otherwise cause.
- Equivalent to NLLLoss(LogSoftmax(x, dim=1), y), but computed in a single pass.
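To make the two-step equivalence and the reduction modes concrete, here is a minimal pure-Python sketch. The functions `log_softmax_rows`, `nll` and `cross_entropy` are hypothetical stand-ins, not the library's implementation, and the sketch deliberately leaves out `weight`, `ignore_index` and `label_smoothing`.

```python
import math

def log_softmax_rows(x):
    """Row-wise stable log-softmax over a list of logit vectors."""
    out = []
    for row in x:
        m = max(row)
        lse = m + math.log(sum(math.exp(v - m) for v in row))
        out.append([v - lse for v in row])
    return out

def nll(log_probs, y, reduction="mean"):
    """Negative log-likelihood with 'none' | 'mean' | 'sum' reduction."""
    per_sample = [-lp[t] for lp, t in zip(log_probs, y)]
    if reduction == "none":
        return per_sample          # one loss value per sample
    if reduction == "sum":
        return sum(per_sample)
    if reduction == "mean":
        return sum(per_sample) / len(per_sample)
    raise ValueError(f"unknown reduction: {reduction!r}")

def cross_entropy(x, y, reduction="mean"):
    """Two-step form: log-softmax followed by negative log-likelihood."""
    return nll(log_softmax_rows(x), y, reduction)

x = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]]
y = [0, 1]
print(cross_entropy(x, y, "none"))
print(cross_entropy(x, y, "mean"))
```

A fused implementation would produce the same values without materialising the full log-probability matrix.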
Examples
Three-class classification with a batch of two samples:
>>> import lucid
>>> import lucid.nn as nn
>>> criterion = nn.CrossEntropyLoss()
>>> x = lucid.tensor([[0.1, 0.9, 0.0], [2.0, 0.5, 0.1]])
>>> y = lucid.tensor([1, 0])
>>> loss = criterion(x, y) # scalar
With label smoothing and per-class weights:
>>> import lucid
>>> import lucid.nn as nn
>>> criterion = nn.CrossEntropyLoss(
... weight=lucid.tensor([1.0, 2.0, 1.0]),
... label_smoothing=0.1,
... )
>>> x = lucid.tensor([[1.0, 2.0, 0.5], [0.2, 0.8, 1.5]])
>>> y = lucid.tensor([1, 2])
>>> loss = criterion(x, y)
Methods (3)
__init__
__init__(weight: Tensor | None = None, ignore_index: int = -100, reduction: str = 'mean', label_smoothing: float = 0.0) → None
Initialise the CrossEntropyLoss module. See the class docstring for parameter semantics.
forward
forward(x: Tensor, target: Tensor) → Tensor
Compute the loss between predictions and targets.
Parameters
- x (Tensor): raw unnormalised logits.
- target (Tensor): integer class indices.
Returns
- Tensor: scalar loss (or unreduced tensor, depending on reduction).
extra_repr
extra_repr() → str
Return a string representation of the layer's configuration.