class detect_anomaly(check_nan: bool = True)

Context manager / decorator that enables autograd anomaly detection.

Reverse-mode automatic differentiation propagates a chain of partial derivatives backwards through the computation graph; a single NaN or Inf appearing anywhere in that chain silently contaminates every upstream gradient and is one of the hardest classes of training bugs to localise. detect_anomaly opts the current scope into stricter checking: after each call to backward the resulting gradients are scanned for non-finite values, and an exception is raised at the offending boundary so the failure surfaces at the source rather than at the optimiser step.

Enabling anomaly detection adds overhead — every backward pass performs an additional reduction over the gradient tensors — so it is intended for debugging sessions, not steady-state training.
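The scan described above can be pictured with a minimal plain-Python sketch. It is not lucid's implementation: `assert_finite_gradients` and the dict-of-lists gradient layout are hypothetical names chosen for illustration, and a real check would presumably run as a reduction over tensors rather than a Python loop.

```python
import math


def assert_finite_gradients(grads):
    """Raise RuntimeError at the first NaN / Inf entry in any named gradient.

    `grads` is a hypothetical mapping from parameter name to a list of floats.
    """
    for name, values in grads.items():
        for index, value in enumerate(values):
            if not math.isfinite(value):
                raise RuntimeError(
                    f"non-finite gradient in {name!r} at index {index}: {value}"
                )


# Finite gradients pass silently.
healthy = {"w": [0.5, -1.25], "b": [0.0]}
assert_finite_gradients(healthy)

# A single NaN is reported together with the parameter it belongs to.
broken = {"w": [0.5, float("nan")], "b": [0.0]}
message = None
try:
    assert_finite_gradients(broken)
except RuntimeError as exc:
    message = str(exc)
```

The extra pass over every gradient entry is where the overhead mentioned above comes from, which is why a check of this kind is normally left off outside debugging.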

Parameters

check_nan : bool = True

If True (default), every backward pass is checked for NaN / Inf gradients and a RuntimeError is raised on the first violation. Setting it to False enters the context without performing any checks, which is useful in nested scopes where an outer block has already enabled the flag.

Attributes

check_nan : bool

The check_nan value as passed to __init__.

Notes

Reverse-mode AD computes

\frac{\partial \mathcal{L}}{\partial x} = \sum_{p \in \text{paths}(x \to \mathcal{L})} \prod_{(u, v) \in p} \frac{\partial v}{\partial u}.

A single non-finite Jacobian entry along any path corrupts the sum, so the check is performed on the final accumulated gradient rather than on individual op outputs.
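As a small worked instance of the formula, the sketch below assumes a graph with two paths from x to the loss, each written as a plain list of local partials. The helper `path_sum_gradient` is hypothetical and only mirrors the sum of products above; it does not use lucid.

```python
import math


def path_sum_gradient(paths):
    """Sum over paths of the product of local partial derivatives."""
    return sum(math.prod(path) for path in paths)


# Both paths finite: 2.0 * 0.5 + 3.0 * 4.0
clean = path_sum_gradient([[2.0, 0.5], [3.0, 4.0]])

# One NaN factor on the second path poisons the whole sum,
# even though the first path is perfectly healthy.
grad = path_sum_gradient([[2.0, 0.5], [3.0, float("nan")]])

print(clean)             # 13.0
print(math.isnan(grad))  # True
```

Because the healthy path's contribution is lost once it is added to NaN, inspecting the accumulated gradient is enough to detect the problem, although it does not say which path introduced it.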

The context manager restores the previous anomaly flag on exit, so nested with blocks behave correctly.

Examples

Use as a context manager during loss computation:
>>> import lucid
>>> from lucid.autograd import detect_anomaly
>>> x = lucid.tensor([1.0, 2.0, 3.0], requires_grad=True)
>>> with detect_anomaly():
...     y = (x * x).sum()
...     y.backward()

Use as a decorator on a training step:
>>> @detect_anomaly()
... def train_step(x, target):
...     loss = ((x - target) ** 2).sum()
...     loss.backward()
...     return loss

Used by 1

lucid.autograd

Constructors

__init__(check_nan: bool = True) → None

Initialise the instance. See the class docstring for parameter semantics.

__call__(fn: object) → object

Support use as a decorator.
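A common way for a context manager to double as a decorator is to have `__call__` wrap the function body in `with self:`. The sketch below assumes that pattern; `counting_scope` is a hypothetical class used only to make the effect observable, and the source does not confirm that lucid implements `__call__` this way.

```python
import functools


class counting_scope:
    """Hypothetical context manager that counts how many scopes are open."""

    active = 0

    def __enter__(self):
        counting_scope.active += 1
        return self

    def __exit__(self, *exc_info):
        counting_scope.active -= 1
        return None

    def __call__(self, fn):
        # Wrap fn so that every call runs inside this context.
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            with self:
                return fn(*args, **kwargs)

        return wrapper


@counting_scope()
def train_step(x):
    """Return x together with the number of scopes open during the call."""
    return x, counting_scope.active


result = train_step(3)
print(result)                 # (3, 1)
print(counting_scope.active)  # 0
```

`functools.wraps` keeps the decorated function's name and docstring intact. The standard library's `contextlib.ContextDecorator` offers the same behaviour as a ready-made mixin.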

Dunder methods

__enter__() → detect_anomaly

Enter the context. Returns self so the value can be bound via with ... as.

__exit__(args: object = ()) → None

Exit the context, restoring any state that was modified on entry.
