hessian

→Tensor or tuple of tuple of Tensor

hessian(func: Callable[..., Tensor], inputs: Tensor | tuple[Tensor, ...], create_graph: bool = False, strict: bool = False, vectorize: bool = False)


Compute the Hessian matrix of a scalar-valued func.

The Hessian of a scalar function $f : \mathbb{R}^n \to \mathbb{R}$ is

$$H_{ij} = \frac{\partial^2 f}{\partial x_i \, \partial x_j}, \qquad H \in \mathbb{R}^{n \times n}.$$

Implemented as the Jacobian of the gradient of func: a forward pass produces the scalar output, a first backward pass (with create_graph=True) builds the gradient graph, and a second backward pass along each gradient coordinate yields the rows of $H$. The cost is therefore $O(n \cdot \text{cost}(\nabla f))$.
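The "Jacobian of the gradient" structure can be illustrated without an autograd engine. The sketch below is not lucid's implementation: it replaces the second backward pass with central finite differences of a hand-written gradient, and it fills $H$ one column at a time (perturbing one input coordinate per step) rather than one row at a time. The names grad_f and hessian_from_gradient, and the test function itself, are hypothetical choices made for illustration.

```python
def grad_f(x):
    """Analytic gradient of the illustrative function f(x) = x0^2 * x1 + x1^3."""
    x0, x1 = x
    return [2.0 * x0 * x1, x0 * x0 + 3.0 * x1 * x1]


def hessian_from_gradient(grad, x, eps=1e-5):
    """Approximate H by differentiating the gradient one coordinate at a time.

    Column j is the central difference of the gradient along x_j, so the
    loop makes 2n gradient calls: the cost grows as n * cost(gradient).
    """
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for j in range(n):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        g_plus = grad(x_plus)
        g_minus = grad(x_minus)
        for i in range(n):
            H[i][j] = (g_plus[i] - g_minus[i]) / (2.0 * eps)
    return H


# Exact Hessian at (1, 2) is [[2*x1, 2*x0], [2*x0, 6*x1]] = [[4, 2], [2, 12]].
H = hessian_from_gradient(grad_f, [1.0, 2.0])
```

An autograd-based version would obtain each slice exactly instead of approximately, but the loop over $n$ coordinates, each costing about one gradient evaluation, is the same.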

Parameters

func : callable

Scalar-valued function of one or more Tensor inputs.

inputs : Tensor or tuple of Tensor

Inputs at which $H$ is evaluated. They are silently promoted to requires_grad=True if necessary.

create_graph : bool = False

If True, the Hessian itself remains differentiable, so third-order derivatives can be computed from it. Defaults to False.

strict : bool = False

Reserved for stricter validation. Currently unused.

vectorize : bool = False

Reserved for a future vmap-based implementation. Currently unused.

Returns

Tensor or tuple of tuple of Tensor

For a single input x, the returned tensor has shape (numel(x), numel(x)). For multiple inputs, a nested tuple of cross-Hessian blocks is returned, with H[i][j] containing $\partial^2 f / (\partial x_i \, \partial x_j)$, where $x_i$ and $x_j$ here denote the i-th and j-th input tensors rather than scalar coordinates.
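The nested-tuple layout can be pictured as a block matrix. The sketch below uses plain nested lists as stand-ins for tensors and assumes each block H[i][j] is two-dimensional with shape (numel(x_i), numel(x_j)); that block shape and the helper name assemble_blocks are assumptions for illustration, not documented behaviour. The blocks are written by hand for $f(x, y) = (x_0 + x_1)\, y_0$ with two elements in x and one in y.

```python
def assemble_blocks(blocks):
    """Stack a nested tuple of 2-D blocks into one dense matrix (list of rows)."""
    full = []
    for block_row in blocks:
        n_rows = len(block_row[0])  # every block in this row has the same height
        for r in range(n_rows):
            row = []
            for block in block_row:
                row.extend(block[r])
            full.append(row)
    return full


# Hand-computed second derivatives of f(x, y) = (x0 + x1) * y0.
blocks = (
    ([[0.0, 0.0], [0.0, 0.0]], [[1.0], [1.0]]),  # d2f/dx dx, d2f/dx dy
    ([[1.0, 1.0]], [[0.0]]),                     # d2f/dy dx, d2f/dy dy
)
full = assemble_blocks(blocks)
```

Here full is the 3 x 3 Hessian with respect to the concatenated variables (x0, x1, y0); the off-diagonal blocks are transposes of each other, as symmetry requires.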

Notes

Symmetry $H_{ij} = H_{ji}$ holds in exact arithmetic when $f$ is $C^2$. In floating-point arithmetic the result is only approximately symmetric; symmetrize as $\tfrac{1}{2}(H + H^\top)$ if a strictly symmetric matrix is required.
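The symmetrization $\tfrac{1}{2}(H + H^\top)$ is a one-liner on any matrix type. The sketch below uses nested lists so that it does not depend on any tensor API; the helper name symmetrize and the slightly asymmetric input values are made up for illustration.

```python
def symmetrize(H):
    """Return 0.5 * (H + H^T) for a square matrix given as a list of rows."""
    n = len(H)
    return [[0.5 * (H[i][j] + H[j][i]) for j in range(n)] for i in range(n)]


# A Hessian whose off-diagonal entries disagree in the last digits.
H_raw = [[4.0, 2.0000001], [1.9999999, 12.0]]
H_sym = symmetrize(H_raw)
```

Because floating-point addition is commutative, the two off-diagonal entries of H_sym are bit-for-bit equal, which is what solvers that require exact symmetry check for.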

Examples

>>> import lucid
>>> from lucid.autograd import hessian
>>> x = lucid.tensor([1.0, 2.0])
>>> def f(x):
...     return (x * x).sum()
>>> H = hessian(f, x)
>>> H.shape
(2, 2)
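As a hand check of the example above, the Hessian of this particular f can be derived directly from the definition. Only the shape is printed in the example; the values below are the mathematically exact result, which the returned tensor should match up to floating-point error.

```latex
f(x) = \sum_{i} x_i^2, \qquad
\frac{\partial f}{\partial x_i} = 2 x_i, \qquad
H_{ij} = \frac{\partial^2 f}{\partial x_i \, \partial x_j} = 2\,\delta_{ij}
\quad\Longrightarrow\quad
H = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}.
```

The result does not depend on the evaluation point, since f is quadratic.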
