vjp

→ tuple of (Tensor, tuple of (Tensor or None))

vjp(func: Callable[..., Tensor], inputs: Tensor | tuple[Tensor, ...], v: Tensor | tuple[Tensor, ...], create_graph: bool = False, strict: bool = False)

Vector-Jacobian product $v^\top J$ (reverse-mode AD).

Given $f : \mathbb{R}^n \to \mathbb{R}^m$ with Jacobian $J \in \mathbb{R}^{m \times n}$ and a cotangent vector $v \in \mathbb{R}^m$, returns

$$v^\top J \in \mathbb{R}^{n}$$

along with the primal output $y = f(x)$. This is the operation that backpropagation performs at every node: when a scalar loss $\mathcal{L}(y)$ is differentiated with respect to an intermediate $y$, the upstream cotangent is $v = \partial \mathcal{L} / \partial y$ and the result is $\partial \mathcal{L} / \partial x$.

Computing a VJP costs about as much as one backward pass, which is much cheaper than materialising $J$ when only the product is needed.
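To make the difference concrete, here is a minimal pure-Python sketch (no Lucid calls) that computes $v^\top J$ two ways for a small hand-picked function: once by building the full Jacobian and once by writing the product out directly. The function `f` and the helpers `jacobian`, `vjp_via_jacobian` and `vjp_closed_form` are hypothetical names chosen for illustration; they are not part of the library.

```python
# Illustrative sketch only: plain Python lists stand in for tensors.

def f(x):
    # f : R^3 -> R^2
    x0, x1, x2 = x
    return [x0 * x1, x0 + x2]

def jacobian(x):
    # Hand-derived 2x3 Jacobian of f at x.
    x0, x1, x2 = x
    return [[x1, x0, 0.0],
            [1.0, 0.0, 1.0]]

def vjp_via_jacobian(x, v):
    # Materialise J (m x n entries), then form the row vector v^T J.
    J = jacobian(x)
    return [sum(v[i] * J[i][j] for i in range(len(v))) for j in range(len(x))]

def vjp_closed_form(x, v):
    # The same product written out directly; J is never stored.
    x0, x1, x2 = x
    v0, v1 = v
    return [v0 * x1 + v1, v0 * x0, v1]

x = [1.0, 2.0, 3.0]
v = [1.0, 0.5]
print(vjp_via_jacobian(x, v))   # [2.5, 1.0, 0.5]
print(vjp_closed_form(x, v))    # [2.5, 1.0, 0.5]
```

Both routes give the same length-$n$ vector. The first stores all $m \times n$ Jacobian entries, while the second only ever holds vectors of length $n$ and $m$, which is the saving a reverse-mode VJP aims for.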

Parameters

func : callable

Function mapping Tensor inputs to a Tensor (or tuple thereof).

inputs : Tensor or tuple of Tensor

Primal point $x$ at which $J$ is evaluated. Silently promoted to requires_grad=True if needed.

v : Tensor or tuple of Tensor

Cotangent vector(s) matching the output shape(s) of func. Scalar-valued v is broadcast for scalar outputs.

create_graph : bool = False

If True, the returned VJP is itself differentiable, enabling double-backward.

strict : bool = False

Reserved for stricter validation. Currently unused.

Returns

tuple of (Tensor, tuple of (Tensor or None))

(output, vjp_grads) where output = func(*inputs) and vjp_grads[i] is $v^\top J$ projected onto input i (or None if that input has no gradient path).

Notes

The dual to vjp is jvp, which computes $J v$ via forward-mode (or finite differences in Lucid's current implementation).
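The duality can be checked numerically through the identity $v^\top (J u) = (v^\top J)\, u$, which holds for any tangent $u \in \mathbb{R}^n$ and cotangent $v \in \mathbb{R}^m$. The sketch below is pure Python and does not call Lucid: `jvp_fd` is a hypothetical forward-difference approximation of $J u$ (an assumption about what a finite-difference JVP might look like, not Lucid's actual code), and `vjp_exact` is a hand-derived VJP for the example function.

```python
# Illustrative sketch only: plain Python lists stand in for tensors.

def f(x):
    # f : R^3 -> R^2
    x0, x1, x2 = x
    return [x0 * x1, x0 + x2]

def jvp_fd(func, x, u, eps=1e-6):
    # Forward-difference approximation of J u: (f(x + eps*u) - f(x)) / eps.
    y0 = func(x)
    y1 = func([xi + eps * ui for xi, ui in zip(x, u)])
    return [(b - a) / eps for a, b in zip(y0, y1)]

def vjp_exact(x, v):
    # Hand-derived v^T J for the f above.
    x0, x1, x2 = x
    v0, v1 = v
    return [v0 * x1 + v1, v0 * x0, v1]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

x = [1.0, 2.0, 3.0]
u = [0.5, -1.0, 2.0]   # tangent, lives in input space
v = [1.0, 0.5]         # cotangent, lives in output space

lhs = dot(v, jvp_fd(f, x, u))    # v . (J u), forward direction
rhs = dot(vjp_exact(x, v), u)    # (v^T J) . u, reverse direction
print(abs(lhs - rhs) < 1e-4)     # True, up to finite-difference error
```

The two sides agree only up to the truncation error of the finite difference, so a check like this needs a tolerance rather than exact equality.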

Examples

>>> import lucid
>>> from lucid.autograd import vjp
>>> x = lucid.tensor([1.0, 2.0, 3.0])
>>> v = lucid.tensor([1.0, 1.0, 1.0])
>>> def f(x):
...     return x * x
>>> y, (grad_x,) = vjp(f, x, v)
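
Assuming vjp follows the definition above, the values in this example can be worked out by hand. For the elementwise square the Jacobian is diagonal, so

$$
y = f(x) = (1, 4, 9), \qquad J = \operatorname{diag}(2x) = \operatorname{diag}(2, 4, 6), \qquad v^\top J = (1, 1, 1)\,\operatorname{diag}(2, 4, 6) = (2, 4, 6)
$$

which means y should hold $(1, 4, 9)$ and grad_x should hold $(2, 4, 6)$. How Lucid prints these tensors is not shown here.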
