binary_cross_entropy_with_logits

Tensor
binary_cross_entropy_with_logits(x: Tensor, target: Tensor, weight: Tensor | None = None, pos_weight: Tensor | None = None, reduction: str = 'mean')

Binary cross-entropy from raw logits (numerically stable).

Mathematically equivalent to binary_cross_entropy(sigmoid(x), target), but evaluated in a log-sum-exp-style form that avoids overflow and underflow when |x| is large. This is the preferred binary classification loss for training: composing a separate sigmoid with BCE risks catastrophic cancellation in $\log(1 - \sigma(x))$ for large positive logits, where $\sigma(x)$ rounds to 1 and the logarithm receives 0.
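
The failure mode can be reproduced in plain Python with float64 arithmetic. The sketch below is illustrative only: `sigmoid`, `naive_bce`, and `stable_bce` are hypothetical helper names, not part of the library, and the stable variant uses one common max/log1p formulation rather than the library's actual code path.

```python
import math

def sigmoid(x: float) -> float:
    # Textbook logistic function; adequate for the non-negative logits used below.
    return 1.0 / (1.0 + math.exp(-x))

def naive_bce(x: float, y: float) -> float:
    # Explicit sigmoid followed by BCE. log(0) is mapped to -inf so the
    # failure shows up as an infinite loss instead of a ValueError.
    p = sigmoid(x)
    log_p = math.log(p) if p > 0.0 else -math.inf
    log_q = math.log(1.0 - p) if p < 1.0 else -math.inf
    return -(y * log_p + (1.0 - y) * log_q)

def stable_bce(x: float, y: float) -> float:
    # max(x, 0) - x*y + log(1 + exp(-|x|)); the exponent is never positive.
    return max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))

# Moderate logit: both routes agree to rounding error.
moderate = (naive_bce(2.0, 1.0), stable_bce(2.0, 1.0))

# Large positive logit with a negative label: sigmoid(40.0) rounds to exactly
# 1.0 in float64, so the naive route returns inf while the stable one stays finite.
large = (naive_bce(40.0, 0.0), stable_bce(40.0, 0.0))

print(moderate)
print(large)
```

In the second case the true loss is approximately 40, which the stable form returns; the naive composition has already lost the information by the time the logarithm is taken.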

Parameters

x: Tensor
Raw logits (unbounded reals), any shape.
target: Tensor
Target probabilities (typically binary), same shape as x.
weight: Tensor or None = None
Element-wise rescaling factor.
pos_weight: Tensor or None = None
Per-class weight applied to the positive term only. Useful for highly imbalanced binary tasks, where setting pos_weight = n_neg / n_pos gives a prevalence-balanced gradient (positives and negatives carry equal total weight).
reduction: str = 'mean'
"mean" (default), "sum", or "none".

Returns

Tensor

Scalar for "mean" or "sum"; a tensor with the same shape as x for "none".

Notes

The numerically stable base form is

$$L_i = \max(x_i, 0) - x_i\,y_i + \log\!\big(1 + e^{-|x_i|}\big),$$

equivalent to $-\big(y\log\sigma(x) + (1-y)\log(1-\sigma(x))\big)$ but free of overflow. With pos_weight, written $w^{+}$:

$$L_i = (1 - y_i)\,x_i + \big(1 + (w^{+} - 1)\,y_i\big) \big(\log(1 + e^{-|x_i|}) + \max(-x_i, 0)\big).$$
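
As a rough cross-check of the two formulas, the following pure-Python sketch evaluates the pos_weight form element by element and compares it with the base form when the weight is 1. The helper names (`bce_logits_elem`, `bce_logits_mean`, `base_form`) and the sample batch are hypothetical. For brevity, pos_weight is treated as a single scalar here and the `weight` argument is omitted, which is a simplification of the documented tensor-valued parameters.

```python
import math

def bce_logits_elem(x: float, y: float, pos_weight: float = 1.0) -> float:
    # Per-element loss in the pos_weight form.
    log_weight = 1.0 + (pos_weight - 1.0) * y
    # log(1 + e^{-x}) computed without overflow.
    softplus_neg_x = math.log1p(math.exp(-abs(x))) + max(-x, 0.0)
    return (1.0 - y) * x + log_weight * softplus_neg_x

def bce_logits_mean(xs, ys, pos_weight: float = 1.0) -> float:
    # "mean" reduction: average over all elements.
    losses = [bce_logits_elem(x, y, pos_weight) for x, y in zip(xs, ys)]
    return sum(losses) / len(losses)

def base_form(x: float, y: float) -> float:
    # Unweighted stable form, for comparison.
    return max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))

# Hypothetical imbalanced batch: 2 positives, 6 negatives.
logits = [1.5, -0.3, -2.0, 0.1, -1.2, 0.8, -0.7, -3.0]
labels = [1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]

n_pos = sum(labels)
n_neg = len(labels) - n_pos
pos_weight = n_neg / n_pos  # 6 / 2 = 3.0

unweighted = bce_logits_mean(logits, labels)
weighted = bce_logits_mean(logits, labels, pos_weight)
print(pos_weight, unweighted, weighted)
```

With pos_weight = 1 the weighted form reduces to the base form because $x + \max(-x, 0) = \max(x, 0)$. With pos_weight = n_neg / n_pos, each positive example's loss is scaled so that the two classes carry equal total weight.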

Gradient w.r.t. x is the clean $\sigma(x_i) - y_i$ (modulo weighting), the canonical reason this form is used in practice instead of the explicit sigmoid + BCE composition.
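
The gradient identity can be sanity-checked numerically. The sketch below compares a central finite difference of the unweighted per-element loss with $\sigma(x) - y$. All helper names are hypothetical, and the check covers only the per-element, unweighted case, before any reduction scaling such as the 1/N factor of a mean.

```python
import math

def stable_bce(x: float, y: float) -> float:
    # Unweighted per-element loss in the stable base form.
    return max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))

def sigmoid(x: float) -> float:
    # Branch on the sign so exp() never receives a large positive argument.
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def central_diff(x: float, y: float, h: float = 1e-6) -> float:
    # Second-order accurate finite-difference estimate of dL/dx.
    return (stable_bce(x + h, y) - stable_bce(x - h, y)) / (2.0 * h)

checks = []
for x, y in [(2.0, 1.0), (-1.0, 0.0), (0.5, 1.0), (3.0, 0.3)]:
    numeric = central_diff(x, y)
    analytic = sigmoid(x) - y
    checks.append((numeric, analytic))
    print(f"x={x:+.1f} y={y:.1f} numeric={numeric:+.6f} analytic={analytic:+.6f}")
```

The last pair uses a soft target (y = 0.3) to show that the identity does not depend on the labels being exactly 0 or 1.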

Examples

>>> import lucid
>>> from lucid.nn.functional import binary_cross_entropy_with_logits
>>> logits = lucid.tensor([2.0, -1.0, 0.5])
>>> target = lucid.tensor([1.0, 0.0, 1.0])
>>> binary_cross_entropy_with_logits(logits, target)
Tensor(0.3048...)