
hessian

Tensor or tuple of tuple of Tensor
hessian(func: Callable[..., Tensor], inputs: Tensor | tuple[Tensor, ...], create_graph: bool = False, strict: bool = False, vectorize: bool = False)

Compute the Hessian matrix of a scalar-valued func.

The Hessian of a scalar function $f : \mathbb{R}^n \to \mathbb{R}$ is

$$H_{ij} = \frac{\partial^2 f}{\partial x_i \, \partial x_j}, \qquad H \in \mathbb{R}^{n \times n}.$$

Implemented as the Jacobian of the gradient of func: a forward pass produces the scalar output, a first backward pass (with create_graph=True) builds the gradient graph, and a second backward pass along each gradient coordinate yields the rows of $H$. The cost is therefore $O(n \cdot \text{cost}(\nabla f))$.
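The "Jacobian of the gradient" idea can be sketched without any autograd machinery. The snippet below is only an illustration under stated assumptions: it replaces the second backward pass with central finite differences, works on plain Python lists rather than lucid tensors, and the names `hessian_from_gradient` and `grad_f` are hypothetical helpers, not part of the library. It does show why the cost scales with $n$ gradient evaluations (here, $2n$ of them).

```python
from typing import Callable, List

Vector = List[float]


def hessian_from_gradient(
    grad: Callable[[Vector], Vector], x: Vector, eps: float = 1e-5
) -> List[Vector]:
    """Approximate H by differentiating grad along each coordinate.

    Central finite differences stand in for the second backward pass;
    this is an illustrative substitute, not the library's autograd path.
    """
    n = len(x)
    columns = []
    for j in range(n):
        # Perturb coordinate j in both directions.
        x_plus = list(x)
        x_minus = list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        g_plus = grad(x_plus)
        g_minus = grad(x_minus)
        # Column j of H: derivative of the gradient with respect to x_j.
        columns.append(
            [(gp - gm) / (2.0 * eps) for gp, gm in zip(g_plus, g_minus)]
        )
    # Transpose columns into rows so that H[i][j] = d^2 f / (dx_i dx_j).
    return [[columns[j][i] for j in range(n)] for i in range(n)]


def grad_f(x: Vector) -> Vector:
    # Hand-written gradient of f(x) = x0^2 * x1 + x1^3.
    return [2.0 * x[0] * x[1], x[0] ** 2 + 3.0 * x[1] ** 2]


H = hessian_from_gradient(grad_f, [1.0, 2.0])
```

For this `f` the exact Hessian at (1, 2) is [[4, 2], [2, 12]], which the sketch reproduces up to round-off. An autograd implementation obtains the same entries exactly, one gradient coordinate at a time.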

Parameters

func : callable
Scalar-valued function of one or more Tensor inputs.
inputs : Tensor or tuple of Tensor
Inputs at which $H$ is evaluated. They are silently promoted to requires_grad=True if necessary.
create_graph : bool = False
If True, the Hessian itself remains differentiable (third-order derivatives). Defaults to False.
strict : bool = False
Reserved for stricter validation. Currently unused.
vectorize : bool = False
Reserved for a future vmap-based implementation. Currently unused.

Returns

Tensor or tuple of tuple of Tensor

For a single input the returned tensor has shape (numel(x), numel(x)). For multiple inputs a nested tuple of cross-Hessian blocks is returned, with H[i][j] containing $\partial^2 f / (\partial x_i \, \partial x_j)$.

Notes

Symmetry $H_{ij} = H_{ji}$ holds in exact arithmetic when $f$ is $C^2$. In floating-point arithmetic the result is only approximately symmetric; symmetrize as $\tfrac{1}{2}(H + H^\top)$ if a strictly symmetric matrix is required.
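The symmetrization step can be sketched as follows. This is a minimal illustration on nested Python lists, since the tensor-level transpose API is not described here; `symmetrize` is a hypothetical helper, not a library function.

```python
from typing import List

Matrix = List[List[float]]


def symmetrize(H: Matrix) -> Matrix:
    """Return (H + H^T) / 2 for a square matrix given as nested lists."""
    n = len(H)
    return [[0.5 * (H[i][j] + H[j][i]) for j in range(n)] for i in range(n)]


# A slightly asymmetric matrix, as floating-point round-off might produce.
H = [[2.0, 1.0 + 1e-9], [1.0 - 1e-9, 4.0]]
H_sym = symmetrize(H)
```

Because each off-diagonal pair is replaced by the same average, the result is symmetric exactly, while the diagonal is left unchanged.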

Examples

>>> import lucid
>>> from lucid.autograd import hessian
>>> x = lucid.tensor([1.0, 2.0])
>>> def f(x):
...     return (x * x).sum()
>>> H = hessian(f, x)
>>> H.shape
(2, 2)
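
As a hand check of the example above (a worked derivation, not captured library output): for $f(x) = \sum_k x_k^2$ the gradient is linear in $x$, so the Hessian is constant and equal to twice the identity, whatever the value of x.

```latex
f(x) = \sum_{k} x_k^2, \qquad
\frac{\partial f}{\partial x_i} = 2 x_i, \qquad
H_{ij} = \frac{\partial^2 f}{\partial x_i \, \partial x_j} = 2\,\delta_{ij}
\;\Longrightarrow\;
H = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}.
```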