multilabel_soft_margin_loss

multilabel_soft_margin_loss(input: Tensor, target: Tensor, weight: Tensor | None = None, reduction: Reduction = 'mean') → Tensor

Per-class logistic loss averaged over labels (multi-label BCE).

The standard objective for multi-label classification with independent per-class probabilities: each class gets its own binary logistic regression head, and the per-sample loss is the mean of the per-class binary cross-entropies. It is mathematically equivalent to applying binary_cross_entropy_with_logits per class and averaging across the class axis.

The loss is computed via the numerically stable identity $\log \sigma(x) = -\mathrm{softplus}(-x)$, which avoids overflow and underflow for large $|x|$.
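To illustrate why the identity matters, the following pure-Python sketch compares a direct evaluation of `log(sigmoid(x))` with the softplus form. The helper names (`softplus`, `log_sigmoid`, `naive_log_sigmoid`) are hypothetical and not part of lucid; the library's internal implementation may differ.

```python
import math


def softplus(x: float) -> float:
    """Stable softplus: log(1 + exp(x)) = max(x, 0) + log1p(exp(-|x|))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def log_sigmoid(x: float) -> float:
    """log(sigmoid(x)) via the identity log sigma(x) = -softplus(-x)."""
    return -softplus(-x)


def naive_log_sigmoid(x: float) -> float:
    """Direct evaluation; exp(-x) overflows for very negative x."""
    return math.log(1.0 / (1.0 + math.exp(-x)))


# Moderate input: both forms agree to floating-point precision.
assert abs(log_sigmoid(1.5) - naive_log_sigmoid(1.5)) < 1e-12

# Extreme input: the naive form raises, the stable form does not.
try:
    naive_log_sigmoid(-800.0)
    naive_failed = False
except OverflowError:
    naive_failed = True

stable_value = log_sigmoid(-800.0)  # finite, no exception
```

For very negative `x` the stable form returns approximately `x` itself, which is the correct asymptotic behavior of $\log \sigma(x)$.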

Parameters

input: Tensor

Raw logits of shape $(N, C)$.

target: Tensor

Target probabilities (typically binary) of shape $(N, C)$.

weight: Tensor or None = None

Per-class weight broadcast against the per-class loss tensor before averaging.

reduction: str = 'mean'

How the per-sample losses are reduced: "mean" (default), "sum", or "none".

Returns

Tensor

Scalar for reduction "mean" or "sum"; per-sample tensor of shape $(N,)$ for "none".

Notes

Per-sample loss, averaged across the $C$ classes:

$$L_i = -\frac{1}{C} \sum_{c=1}^{C} \Big[ t_{i,c}\,\log \sigma(x_{i,c}) + (1 - t_{i,c})\,\log(1 - \sigma(x_{i,c})) \Big]$$
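The formula can be checked by hand with a small reference computation. The sketch below is a plain-Python reading of the definition, not lucid's implementation; the function name `multilabel_soft_margin_reference` is hypothetical, and the handling of `weight` (multiplying each per-class term before the class average) is an assumption based on the parameter description.

```python
import math


def softplus(x: float) -> float:
    # Stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def multilabel_soft_margin_reference(logits, targets, weight=None, reduction="mean"):
    """Reference computation on nested lists of shape (N, C)."""
    per_sample = []
    for x_row, t_row in zip(logits, targets):
        num_classes = len(x_row)
        total = 0.0
        for c, (x, t) in enumerate(zip(x_row, t_row)):
            # -[t*log sigma(x) + (1-t)*log(1 - sigma(x))]
            #   = t*softplus(-x) + (1-t)*softplus(x)
            term = t * softplus(-x) + (1.0 - t) * softplus(x)
            if weight is not None:
                term *= weight[c]  # assumed: per-class weight applied before averaging
            total += term
        per_sample.append(total / num_classes)  # L_i: mean over the C classes

    if reduction == "none":
        return per_sample
    if reduction == "sum":
        return sum(per_sample)
    if reduction == "mean":
        return sum(per_sample) / len(per_sample)
    raise ValueError(f"unknown reduction: {reduction!r}")


# With zero logits, sigma(x) = 0.5 for every class, so each term is log(2).
loss = multilabel_soft_margin_reference([[0.0, 0.0]], [[1.0, 0.0]])
```

With all logits at zero, every per-class term equals $\log 2$ regardless of the target, so `loss` is $\log 2 \approx 0.6931$.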

Because the per-class predictions are independent (no softmax coupling), the gradient through each class is exactly that of a single binary logistic regression — convenient for highly multi-label problems where the active label set is sparse.
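Differentiating the per-sample loss gives $\partial L_i / \partial x_{i,c} = (\sigma(x_{i,c}) - t_{i,c}) / C$, the familiar logistic-regression gradient scaled by the class average. The sketch below checks this against central finite differences in plain Python; all helper names are hypothetical and the code does not use lucid's autograd.

```python
import math


def sigmoid(x: float) -> float:
    # Stable sigmoid: avoid exp() of a large positive argument.
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def softplus(x: float) -> float:
    # Stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def sample_loss(x_row, t_row) -> float:
    # L_i for one sample: mean of per-class binary cross-entropies.
    terms = (t * softplus(-x) + (1.0 - t) * softplus(x) for x, t in zip(x_row, t_row))
    return sum(terms) / len(x_row)


def analytic_grad(x_row, t_row):
    # dL_i/dx_c = (sigmoid(x_c) - t_c) / C; each class depends only on its own logit.
    num_classes = len(x_row)
    return [(sigmoid(x) - t) / num_classes for x, t in zip(x_row, t_row)]


def numeric_grad(x_row, t_row, eps: float = 1e-6):
    # Central finite differences, one logit at a time.
    grads = []
    for c in range(len(x_row)):
        up, down = list(x_row), list(x_row)
        up[c] += eps
        down[c] -= eps
        grads.append((sample_loss(up, t_row) - sample_loss(down, t_row)) / (2.0 * eps))
    return grads


x_row, t_row = [2.0, -1.0, 0.5], [1.0, 0.0, 1.0]
max_error = max(
    abs(a - n) for a, n in zip(analytic_grad(x_row, t_row), numeric_grad(x_row, t_row))
)
```

Because each gradient entry involves only its own logit and target, perturbing one class leaves the gradients of the others unchanged, which is the independence described above.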

Examples

>>> import lucid
>>> from lucid.nn.functional import multilabel_soft_margin_loss
>>> logits = lucid.tensor([[2.0, -1.0, 0.5]])
>>> target = lucid.tensor([[1.0, 0.0, 1.0]])
>>> multilabel_soft_margin_loss(logits, target)
Tensor(0.3047...)
