class

MultivariateNormal

extends Distribution
MultivariateNormal(loc: Tensor, covariance_matrix: Tensor | None = None, precision_matrix: Tensor | None = None, scale_tril: Tensor | None = None, validate_args: bool | None = None)

Multivariate Normal (Gaussian) distribution in $\mathbb{R}^D$.

MultivariateNormal(loc=μ, ...) defines the $D$-dimensional Gaussian with mean vector $\mu$ and a covariance structure expressed in one of three equivalent forms. Internally, all forms are converted to the lower-triangular Cholesky factor $L$ of $\Sigma$ (i.e. $\Sigma = L L^\top$), which is the numerically preferred representation for both sampling and log-probability evaluation.

Specify exactly one of:

  • covariance_matrix $\Sigma$ (positive-definite, $D \times D$),
  • precision_matrix $\Sigma^{-1}$ (positive-definite, inverted via Cholesky internally), or
  • scale_tril $L$ (lower-triangular with positive diagonal).
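The equivalence of the three forms can be illustrated with a small pure-Python sketch that does not use lucid at all. The helper names `cholesky`, `solve_lower`, `invert_spd` and `to_scale_tril` are hypothetical and introduced only for this illustration; the library's internal conversion may be organised differently.

```python
import math

def cholesky(a):
    """Lower-triangular L with a = L L^T, for a symmetric positive-definite matrix."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive-definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: return y such that L y = b."""
    y = []
    for i, row in enumerate(L):
        y.append((b[i] - sum(row[k] * y[k] for k in range(i))) / row[i])
    return y

def invert_spd(p):
    """Invert a symmetric positive-definite matrix via its Cholesky factor."""
    n = len(p)
    m = cholesky(p)  # p = M M^T
    # cols[j] is column j of M^{-1}, found by solving M y = e_j.
    cols = [solve_lower(m, [1.0 if i == j else 0.0 for i in range(n)])
            for j in range(n)]
    # p^{-1} = M^{-T} M^{-1}, so entry (i, j) is the dot product of columns i and j.
    return [[sum(cols[i][k] * cols[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]

def to_scale_tril(covariance_matrix=None, precision_matrix=None, scale_tril=None):
    """Reduce any one of the three parameterisations to the Cholesky factor L."""
    given = [m is not None for m in (covariance_matrix, precision_matrix, scale_tril)]
    if sum(given) != 1:
        raise ValueError("specify exactly one of covariance_matrix, "
                         "precision_matrix, scale_tril")
    if scale_tril is not None:
        return scale_tril
    if covariance_matrix is not None:
        return cholesky(covariance_matrix)
    return cholesky(invert_spd(precision_matrix))

sigma = [[4.0, 2.0], [2.0, 3.0]]
L_from_cov = to_scale_tril(covariance_matrix=sigma)
L_from_prec = to_scale_tril(precision_matrix=invert_spd(sigma))
L_from_tril = to_scale_tril(scale_tril=L_from_cov)
```

All three calls describe the same distribution, so the resulting factors agree up to floating-point error.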

Parameters

loc: Tensor
Mean vector $\mu \in \mathbb{R}^D$ of shape (..., D).
covariance_matrix: Tensor | None = None
Full covariance matrix $\Sigma$ of shape (..., D, D).
precision_matrix: Tensor | None = None
Precision matrix $\Sigma^{-1}$ of shape (..., D, D).
scale_tril: Tensor | None = None
Lower-triangular Cholesky factor $L$ of shape (..., D, D) with positive diagonal entries.
validate_args: bool | None = None
If True, validate parameter constraints at construction time.

Attributes

loc: Tensor
Mean vector $\mu$.
scale_tril: Tensor
Cholesky factor $L$ (always populated, regardless of which parameterisation was used for construction).

Notes

PDF:

$$p(x; \mu, \Sigma) = (2\pi)^{-D/2} |\Sigma|^{-1/2} \exp\!\left(-\tfrac{1}{2}(x-\mu)^\top \Sigma^{-1} (x-\mu)\right)$$

Log-PDF (numerically stable via Cholesky, avoiding explicit inversion):

$$\log p(x) = -\tfrac{1}{2}\|L^{-1}(x-\mu)\|^2 - \sum_i \log L_{ii} - \tfrac{D}{2}\log(2\pi)$$

where $\|L^{-1}(x-\mu)\|^2$ is the squared Mahalanobis distance, computed via a triangular solve (no explicit matrix inversion).
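The log-density formula can be sketched in plain Python to show where the triangular solve enters. The $-\sum_i \log L_{ii}$ term comes from $|\Sigma| = |L|^2 = \prod_i L_{ii}^2$, so that $\tfrac{1}{2}\log|\Sigma| = \sum_i \log L_{ii}$. The function `mvn_log_prob` is a hypothetical name for this illustration and works on nested lists rather than lucid tensors.

```python
import math

def mvn_log_prob(x, loc, scale_tril):
    """Log-density of N(loc, L L^T) at x, using forward substitution only."""
    d = len(loc)
    diff = [xi - mi for xi, mi in zip(x, loc)]
    # Forward substitution solves L z = (x - mu), i.e. z = L^{-1}(x - mu).
    z = []
    for i in range(d):
        acc = diff[i] - sum(scale_tril[i][k] * z[k] for k in range(i))
        z.append(acc / scale_tril[i][i])
    mahalanobis_sq = sum(zi * zi for zi in z)
    half_log_det = sum(math.log(scale_tril[i][i]) for i in range(d))
    return -0.5 * mahalanobis_sq - half_log_det - 0.5 * d * math.log(2.0 * math.pi)

# Standard 2-D Gaussian evaluated at its mean: the value equals -log(2*pi).
identity = [[1.0, 0.0], [0.0, 1.0]]
lp_at_mean = mvn_log_prob([0.0, 0.0], [0.0, 0.0], identity)

# Correlated case with Sigma = [[4, 2], [2, 3]], whose Cholesky factor is below.
L = [[2.0, 0.0], [1.0, math.sqrt(2.0)]]
lp_correlated = mvn_log_prob([1.0, 1.0], [0.0, 0.0], L)
```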

Moments:

  • Mean: $E[X] = \mu$
  • Mode: $\mu$ (the Gaussian is unimodal)
  • Marginal variances: diagonal of $\Sigma = L L^\top$

Entropy:

$$H[X] = \tfrac{D}{2}(1 + \log(2\pi)) + \sum_i \log L_{ii}$$
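The entropy expression follows from the log-density given earlier in these notes. A short derivation, using the substitution $z = L^{-1}(X - \mu)$, for which $z \sim \mathcal{N}(0, I_D)$ and therefore $\mathbb{E}\,\|z\|^2 = D$:

```latex
\begin{aligned}
H[X] &= -\mathbb{E}[\log p(X)] \\
     &= \tfrac{1}{2}\,\mathbb{E}\big[\|L^{-1}(X-\mu)\|^2\big]
        + \sum_i \log L_{ii} + \tfrac{D}{2}\log(2\pi) \\
     &= \tfrac{D}{2} + \sum_i \log L_{ii} + \tfrac{D}{2}\log(2\pi) \\
     &= \tfrac{D}{2}\big(1 + \log(2\pi)\big) + \sum_i \log L_{ii}
\end{aligned}
```

The result depends only on the diagonal of $L$, not on $\mu$ or the off-diagonal entries.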

Reparameterised sampling uses the Cholesky factorisation:

$$X = \mu + L \varepsilon, \quad \varepsilon \sim \mathcal{N}(0, I_D)$$

Gradients propagate through $\mu$ and $L$ unobstructed.
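The sampling transform itself can be sketched with the standard library alone. This illustrates only the map from standard-normal noise to a sample; plain Python has no automatic differentiation, so the gradient behaviour described above is not demonstrated here. The name `mvn_rsample` is hypothetical.

```python
import random

def mvn_rsample(loc, scale_tril, rng):
    """Draw one sample x = mu + L @ eps with eps ~ N(0, I)."""
    d = len(loc)
    eps = [rng.gauss(0.0, 1.0) for _ in range(d)]
    # L is lower-triangular, so row i only touches eps[0..i].
    return [loc[i] + sum(scale_tril[i][k] * eps[k] for k in range(i + 1))
            for i in range(d)]

rng = random.Random(0)
loc = [1.0, -1.0]
scale_tril = [[2.0, 0.0], [1.0, 2.0 ** 0.5]]  # L L^T = [[4, 2], [2, 3]]

samples = [mvn_rsample(loc, scale_tril, rng) for _ in range(20000)]
n = len(samples)

# Empirical moments should be close to loc and to the entries of L L^T.
mean = [sum(s[i] for s in samples) / n for i in range(2)]
var0 = sum((s[0] - mean[0]) ** 2 for s in samples) / n
cov01 = sum((s[0] - mean[0]) * (s[1] - mean[1]) for s in samples) / n
```

Because all randomness sits in `eps`, the sample is a deterministic, differentiable function of `loc` and `scale_tril`, which is the property an autodiff framework relies on.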

Examples

>>> import lucid
>>> from lucid.distributions import MultivariateNormal
>>> # 2-d isotropic Gaussian
>>> dist = MultivariateNormal(
...     loc=lucid.zeros(2),
...     covariance_matrix=lucid.eye(2),
... )
>>> samples = dist.rsample((50,))
>>> samples.shape  # (50, 2)
(50, 2)
>>> # Log-prob at the mean (maximum)
>>> dist.log_prob(lucid.zeros(2))

Methods (8)

dunder

__init__

None
__init__(loc: Tensor, covariance_matrix: Tensor | None = None, precision_matrix: Tensor | None = None, scale_tril: Tensor | None = None, validate_args: bool | None = None)

Initialise a MultivariateNormal distribution.

Exactly one of covariance_matrix, precision_matrix, or scale_tril must be provided. All parameterisations are internally converted to the lower-triangular Cholesky factor L via lucid.linalg.cholesky or triangular inversion.

Parameters

loc: Tensor
Mean vector $\mu \in \mathbb{R}^D$ of shape (..., D).
covariance_matrix: Tensor | None = None
Full covariance matrix $\Sigma$ of shape (..., D, D).
precision_matrix: Tensor | None = None
Precision matrix $\Sigma^{-1}$ of shape (..., D, D).
scale_tril: Tensor | None = None
Lower-triangular Cholesky factor $L$ of shape (..., D, D) with positive diagonal.
validate_args: bool | None = None
If True, validate parameter constraints at construction time.

Raises

ValueError
If not exactly one of the covariance parameterisations is given.
prop

covariance_matrix

Tensor
covariance_matrix: Tensor

Full covariance matrix $\Sigma = L L^\top$.

Returns

Tensor

Positive-definite covariance matrix of shape (*batch_shape, D, D).

prop

mean

Tensor
mean: Tensor

Mean of the MultivariateNormal: $E[X] = \mu$.

Returns

Tensor

Mean vector of shape (*batch_shape, D).

prop

mode

Tensor
mode: Tensor

Mode of the MultivariateNormal: equal to the mean $\mu$.

The Gaussian is unimodal; its unique maximum is at the mean.

Returns

Tensor

Mode vector of shape (*batch_shape, D).

prop

variance

Tensor
variance: Tensor

Marginal variances: the diagonal entries of $\Sigma = L L^\top$.

Returns

Tensor

Variance vector of shape (*batch_shape, D).
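The diagonal of $L L^\top$ can be read off without forming the full product, since entry $i$ is the sum of squares of row $i$ of $L$. A minimal sketch on nested lists follows; `marginal_variances` is a hypothetical helper, and whether the library computes the property this way is an assumption.

```python
def marginal_variances(scale_tril):
    """Diagonal of L L^T: entry i is the sum of squares of row i of L."""
    return [sum(v * v for v in row) for row in scale_tril]

scale_tril = [[2.0, 0.0], [1.0, 2.0 ** 0.5]]  # L L^T = [[4, 2], [2, 3]]
variances = marginal_variances(scale_tril)    # approximately [4.0, 3.0]
```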

fn

rsample

Tensor
rsample(sample_shape: tuple[int, ...] = ())

Draw reparameterised samples via the Cholesky factorisation.

$$X = \mu + L \varepsilon, \quad \varepsilon \sim \mathcal{N}(0, I_D)$$

Gradients propagate through both $\mu$ and $L$.

Parameters

sample_shape: tuple[int, ...] = ()
Leading shape of the output sample batch.

Returns

Tensor

Samples of shape (*sample_shape, *batch_shape, D).
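The shape rule can be spelled out as tuple concatenation. The helper `rsample_output_shape` is hypothetical and exists only to make the ordering of the three parts explicit.

```python
def rsample_output_shape(sample_shape, batch_shape, event_dim):
    """Compose (*sample_shape, *batch_shape, D) as a plain tuple."""
    return tuple(sample_shape) + tuple(batch_shape) + (event_dim,)

# Unbatched 2-D distribution, 50 draws: matches the (50, 2) example above.
shape_unbatched = rsample_output_shape((50,), (), 2)    # (50, 2)

# A batch of 4 distributions, sampled with sample_shape (50, 3).
shape_batched = rsample_output_shape((50, 3), (4,), 2)  # (50, 3, 4, 2)
```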

fn

log_prob

Tensor
log_prob(value: Tensor)

Log-probability density of the MultivariateNormal distribution.

Computed via the Cholesky factor to avoid explicit matrix inversion:

$$\log p(x) = -\tfrac{1}{2}\|L^{-1}(x-\mu)\|^2 - \sum_i \log L_{ii} - \tfrac{D}{2}\log(2\pi)$$

Parameters

value: Tensor
Observation vectors of shape (*batch_shape, D).

Returns

Tensor

Log-density values of shape batch_shape.

fn

entropy

Tensor
entropy()

Entropy of the MultivariateNormal distribution.

$$H[X] = \tfrac{D}{2}(1 + \log(2\pi)) + \sum_i \log L_{ii}$$

Returns

Tensor

Entropy values of shape batch_shape (nats).