class

BCEWithLogitsLoss

extends Module

BCEWithLogitsLoss(weight: Tensor | None = None, reduction: Reduction = 'mean', pos_weight: Tensor | None = None)

Binary cross-entropy loss that accepts raw logits.

Combines a Sigmoid activation with a binary cross-entropy loss in a single numerically stable expression. Using the identity:

$$\log(1 + e^x) = \max(x, 0) + \log\!\bigl(1 + e^{-|x|}\bigr)$$

the per-element loss $-y \log \sigma(x) - (1 - y) \log(1 - \sigma(x)) = \log(1 + e^x) - x\,y$ is computed as:

$$\ell(x, y) = \max(x, 0) - x\,y + \log\!\bigl(1 + e^{-|x|}\bigr)$$
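As a rough illustration of why the rewritten form matters, the sketch below evaluates the per-element loss in plain Python using only the `math` module. The helper names `bce_with_logits` and `bce_naive` are hypothetical and not part of lucid; the library's own implementation operates on tensors and may differ in detail.

```python
import math


def bce_with_logits(x: float, y: float) -> float:
    """Stable per-element loss: max(x, 0) - x*y + log(1 + exp(-|x|))."""
    # exp(-|x|) is always in (0, 1], so it can never overflow.
    return max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))


def bce_naive(x: float, y: float) -> float:
    """Reference form that applies an explicit sigmoid first (hypothetical)."""
    s = 1.0 / (1.0 + math.exp(-x))
    return -(y * math.log(s) + (1.0 - y) * math.log(1.0 - s))


# Both forms agree for moderate logits.
print(bce_with_logits(2.0, 1.0), bce_naive(2.0, 1.0))

# For a saturated logit the sigmoid rounds to exactly 1.0, so the naive
# form ends up taking log(0), while the stable form stays finite.
print(bce_with_logits(40.0, 0.0))
try:
    bce_naive(40.0, 0.0)
except ValueError as err:
    print("naive form failed:", err)
```

The stable form only ever exponentiates a non-positive number, which is what keeps it finite for logits of any magnitude.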

When a positive-class weight $p$ is supplied (pos_weight), the loss becomes:

$$\ell(x, y) = -p \cdot y \log \sigma(x) - (1 - y) \log(1 - \sigma(x))$$

where $\sigma$ is the sigmoid function. A value of `pos_weight > 1` up-weights the positive class, which is useful when positives are rare.
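The weighted formula can be sketched in plain Python by using $-\log \sigma(x) = \operatorname{softplus}(-x)$ and $-\log(1 - \sigma(x)) = \operatorname{softplus}(x)$. The names `softplus` and `weighted_bce_with_logits` are hypothetical helpers for illustration, not lucid functions, and the scalar `p` stands in for `pos_weight`.

```python
import math


def softplus(z: float) -> float:
    """Stable log(1 + exp(z)) = max(z, 0) + log(1 + exp(-|z|))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def weighted_bce_with_logits(x: float, y: float, p: float = 1.0) -> float:
    """Per-element loss -p*y*log(sigmoid(x)) - (1-y)*log(1 - sigmoid(x))."""
    # Only the positive-class term is scaled by p.
    return p * y * softplus(-x) + (1.0 - y) * softplus(x)


# A positive example contributes p times its unweighted loss ...
print(weighted_bce_with_logits(0.2, 1.0, p=10.0))
print(10.0 * weighted_bce_with_logits(0.2, 1.0))

# ... while a negative example is unaffected by p.
print(weighted_bce_with_logits(-0.8, 0.0, p=10.0))
print(weighted_bce_with_logits(-0.8, 0.0))
```

With `p = 1` this reduces to the unweighted loss.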

Parameters

weight: Tensor = None

Element-wise weight tensor broadcast over input and target.

reduction: str = 'mean'

How the per-element losses are reduced: 'none' | 'mean' (default) | 'sum'.

pos_weight: Tensor = None

Weight for the positive class, shape (C,) or scalar. Provides class-level rebalancing independent of weight.
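How `weight` and `reduction` might interact can be sketched on a plain list of per-element losses. This assumes PyTorch-style semantics, where the element-wise weight is applied before reduction and 'mean' divides by the number of elements rather than by the sum of the weights. lucid's actual behaviour is not confirmed here, and `apply_weight_and_reduce` is a hypothetical helper.

```python
from typing import List, Optional, Union


def apply_weight_and_reduce(
    losses: List[float],
    weight: Optional[List[float]] = None,
    reduction: str = "mean",
) -> Union[float, List[float]]:
    """Scale per-element losses by `weight`, then reduce (assumed semantics)."""
    if weight is not None:
        losses = [w * l for w, l in zip(weight, losses)]
    if reduction == "none":
        return losses                     # same shape as the input
    if reduction == "sum":
        return sum(losses)                # scalar
    if reduction == "mean":
        return sum(losses) / len(losses)  # scalar, averaged over all elements
    raise ValueError(f"unknown reduction: {reduction!r}")


per_element = [1.0, 2.0, 3.0]
print(apply_weight_and_reduce(per_element, reduction="none"))
print(apply_weight_and_reduce(per_element, weight=[1.0, 0.0, 2.0], reduction="sum"))
print(apply_weight_and_reduce(per_element, reduction="mean"))
```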

Attributes

weight: Tensor or None

Element-wise weight.

reduction: str

The reduction mode.

pos_weight: Tensor or None

Positive-class weight.

Notes

Input x : $(*)$ — raw logits (any real value).
Target y : $(*)$ — binary labels in $\{0, 1\}$ .
Output : scalar for 'mean' / 'sum'; $(*)$ for 'none'.

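Numerically superior to BCELoss(Sigmoid(x), y): for large $|x|$ the sigmoid output rounds to exactly 0 or 1 in floating point, so taking its logarithm separately can produce infinite losses and unusable gradients, whereas the fused log-domain expression stays finite near saturation.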
pos_weight values significantly larger than 1 can destabilise training; values in [1, 10] are typically safe.
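The gradient behaviour can be checked with a small sketch. Differentiating the unweighted per-element loss $\log(1 + e^x) - x\,y$ with respect to $x$ gives $\sigma(x) - y$, which stays bounded even for saturated logits. The helpers below are hypothetical plain-Python stand-ins, not lucid's autograd.

```python
import math


def sigmoid(x: float) -> float:
    """Sigmoid evaluated without overflowing for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def loss(x: float, y: float) -> float:
    """Stable per-element loss: max(x, 0) - x*y + log(1 + exp(-|x|))."""
    return max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))


def grad(x: float, y: float) -> float:
    """Analytic derivative of the loss with respect to the logit x."""
    return sigmoid(x) - y


def numeric_grad(x: float, y: float, h: float = 1e-6) -> float:
    """Central finite difference, used only as a sanity check."""
    return (loss(x + h, y) - loss(x - h, y)) / (2.0 * h)


# Analytic and numeric gradients agree at a moderate logit.
print(grad(0.5, 1.0), numeric_grad(0.5, 1.0))

# A badly misclassified positive (x = -50, y = 1) still yields a finite
# loss and a gradient close to -1 rather than zero.
print(loss(-50.0, 1.0), grad(-50.0, 1.0))
```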

Examples

Raw logits, no weighting:
>>> import lucid
>>> import lucid.nn as nn
>>> criterion = nn.BCEWithLogitsLoss()
>>> logits  = lucid.tensor([ 2.0, -1.0, 0.5, -3.0])
>>> targets = lucid.tensor([ 1.0,  0.0, 1.0,  0.0])
>>> loss = criterion(logits, targets)
Up-weighting the positive class by a factor of 10:
>>> import lucid
>>> import lucid.nn as nn
>>> criterion = nn.BCEWithLogitsLoss(
...     pos_weight=lucid.tensor([10.0])
... )
>>> logits  = lucid.tensor([0.2, -0.8, 1.5])
>>> targets = lucid.tensor([1.0,  0.0, 1.0])
>>> loss = criterion(logits, targets)

Constructors

__init__

→None

__init__(weight: Tensor | None = None, reduction: Reduction = 'mean', pos_weight: Tensor | None = None)

Initialise the BCEWithLogitsLoss module. See the class docstring for parameter semantics.

Instance methods

extra_repr

→str

extra_repr()

Return a string representation of the layer's configuration.

forward

→Tensor

forward(x: Tensor, target: Tensor)

Compute the loss between predictions and targets.

Parameters

x: Tensor

Raw logits of any shape $(*)$.

target: Tensor

Binary labels in $\{0, 1\}$ with the same shape as x.

Returns

Tensor

Scalar loss for 'mean' or 'sum'; unreduced tensor of shape $(*)$ for 'none'.
