Distribution
Distribution(batch_shape: tuple[int, ...] = (), event_shape: tuple[int, ...] = (), validate_args: bool | None = None)
Abstract base for a probability distribution.
A distribution encodes a probability measure over a measurable space
and exposes a standard interface for sampling, evaluating
log-probabilities, and computing closed-form moments. Every concrete
distribution in lucid.distributions inherits from this class.
Subclasses set:
    arg_constraints: dict of param-name → Constraint, used by validate_args and to spell out the parameter domain.
    support: Constraint for the random variable.
    has_rsample: True for reparameterisable families.
    batch_shape, event_shape.
Either rsample or sample (or both) must be overridden.
Parameters
batch_shape : tuple[int, ...] = ()
    Shape of the batch of distributions. For a scalar parameter it is
    (), for a vector parameter of length n it is (n,). Default is ().
event_shape : tuple[int, ...] = ()
    Shape of a single draw. Univariate distributions have
    event_shape = (). Multivariate distributions such as
    lucid.distributions.Dirichlet have a non-empty event_shape.
    Default is ().
validate_args : bool or None = None
    When True, parameter constraints and sample support are
    validated at construction time and in log_prob. Useful
    during development; disable in production for speed. None
    inherits the class-level _validate_args flag. Default is None.
Attributes
arg_constraints : dict[str, Constraint]
    Maps each parameter name to the
    lucid.distributions.constraints.Constraint it must
    satisfy. Populated by each concrete subclass.
support : Constraint or None
    Constraint on the random variable. None means
    unconstrained.
has_rsample : bool
    True when the distribution implements rsample, i.e., when the
    reparameterisation trick (Kingma & Welling, 2013) is available
    and gradients flow through sampled values.
has_enumerate_support : bool
    True for finite discrete distributions that can enumerate
    every possible outcome, enabling exact marginalisation.
Notes
Shape semantics
Every tensor returned by sample or rsample has shape
sample_shape + batch_shape + event_shape,
where + denotes tuple concatenation. log_prob returns a
tensor of shape sample_shape + batch_shape, having reduced over
event_shape.
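The shape rule can be checked with plain tuples. The sketch below is illustrative only: the helper names draw_shape and log_prob_shape are hypothetical and are not part of lucid.distributions.

```python
def draw_shape(sample_shape, batch_shape, event_shape):
    """Shape of a tensor returned by sample / rsample (tuple concatenation)."""
    return tuple(sample_shape) + tuple(batch_shape) + tuple(event_shape)


def log_prob_shape(sample_shape, batch_shape):
    """Shape of log_prob for such a draw: the event dimensions are reduced away."""
    return tuple(sample_shape) + tuple(batch_shape)


# Hypothetical case: 100 draws from a batch of 3 distributions over 5-vectors.
print(draw_shape((100,), (3,), (5,)))   # (100, 3, 5)
print(log_prob_shape((100,), (3,)))     # (100, 3)

# A scalar distribution drawn once has an empty shape throughout.
print(draw_shape((), (), ()))           # ()
```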
Reparameterisation
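When has_rsample = True the sampler can be written as a
deterministic transformation of a fixed-distribution noise variable
$\varepsilon$:
$$x = g_\theta(\varepsilon), \qquad \varepsilon \sim p(\varepsilon),$$
where $\theta$ denotes the distribution parameters and $p(\varepsilon)$ does not depend on them.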
This allows gradients to be estimated with low variance via the pathwise derivative, which is the backbone of the VAE objective (ELBO) and stochastic computation graphs in general.
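As a rough illustration of the pathwise derivative, the sketch below uses only the Python standard library rather than lucid itself. It assumes a Normal family with the transformation x = mu + sigma * eps and a toy objective E[x^2]; the function name pathwise_grad_mu is hypothetical.

```python
import random


def pathwise_grad_mu(mu, sigma, n_samples=20000, seed=0):
    """Estimate d/dmu E[x^2] for x ~ Normal(mu, sigma) using x = mu + sigma * eps."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        eps = rng.gauss(0.0, 1.0)   # parameter-free noise
        x = mu + sigma * eps        # deterministic transformation of the noise
        total += 2.0 * x            # d(x^2)/dmu = 2x * dx/dmu, and dx/dmu = 1
    return total / n_samples


estimate = pathwise_grad_mu(mu=1.5, sigma=0.5)
exact = 2.0 * 1.5   # E[x^2] = mu^2 + sigma^2, so the derivative w.r.t. mu is 2 * mu
print(estimate, exact)
```

Because the noise is sampled independently of mu, the derivative can be taken inside the expectation, which is what makes the per-sample term 2x a valid gradient estimate.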
Examples
>>> import lucid.distributions as dist
>>> d = dist.Normal(loc=0.0, scale=1.0)
>>> d.batch_shape
()
>>> d.event_shape
()
>>> x = d.rsample((100,)) # shape (100,)
>>> x.shape
(100,)
Methods (15)
__init__
→ None
__init__(batch_shape: tuple[int, ...] = (), event_shape: tuple[int, ...] = (), validate_args: bool | None = None)
Initialise batch/event shapes and optionally validate parameters.
Parameters
batch_shape : tuple[int, ...] = ()
event_shape : tuple[int, ...] = ()
validate_args : bool or None = None
    When True, _validate_params is called immediately so
    that out-of-constraint constructor arguments raise
    ValueError at construction time rather than silently
    producing NaN values later.
batch_shape
→ tuple[int, ...]
batch_shape: tuple[int, ...]
Shape of the batch of independent (but not identically parameterised) distributions.
Returns
tuple[int, ...]
    A tuple of integers. () for a single scalar distribution.
event_shape
→ tuple[int, ...]
event_shape: tuple[int, ...]
Shape of a single observation drawn from the distribution.
Returns
tuple[int, ...]
    () for univariate distributions. Non-empty for
    multivariate distributions such as
    lucid.distributions.Dirichlet.
mean
→ Tensor
mean: Tensor
Expected value of the distribution.
Returns
Tensor
    A tensor of shape batch_shape. Raises
    NotImplementedError if the distribution has no
    closed-form mean, or no defined mean at all
    (e.g. lucid.distributions.Cauchy).
mode
→ Tensor
mode: Tensor
Most likely value of the distribution (the argmax of the density).
Returns
Tensor
    A tensor of shape batch_shape. Raises
    NotImplementedError if not implemented by the
    concrete subclass.
variance
→ Tensor
variance: Tensor
Variance of the distribution.
Returns
Tensor
    A tensor of shape batch_shape. The variance is the
    second central moment, $\operatorname{Var}[X] = \mathbb{E}\big[(X - \mathbb{E}[X])^2\big]$.
    Raises NotImplementedError if not provided by the
    concrete subclass.
stddev
→ Tensor
stddev: Tensor
Standard deviation of the distribution.
Computed as $\sqrt{\operatorname{Var}[X]}$. Concrete
subclasses may override this for numerical efficiency; the default
implementation delegates to variance.
Returns
Tensor
    A tensor of shape batch_shape.
entropy
→ Tensor
entropy()
Shannon differential (or discrete) entropy.
Defined as
$$H[p] = -\int p(x) \log p(x)\, dx$$
for a continuous distribution and similarly for discrete ones, with the integral replaced by a sum over the support. Measured in nats (natural logarithm base).
Returns
Tensor
    A tensor of shape batch_shape. Raises
    NotImplementedError if not implemented by the
    concrete subclass.
Notes
Entropy quantifies the average amount of "surprise" or uncertainty in a single draw. Higher entropy means the distribution is more spread out / less predictable.
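To make the "more spread out means higher entropy" reading concrete, here is a small standard-library sketch using two well-known closed forms: the differential entropy of a Normal and the entropy of a categorical distribution. The function names are hypothetical and do not call lucid.

```python
import math


def normal_entropy(sigma):
    """Differential entropy of Normal(mu, sigma) in nats: 0.5 * log(2 * pi * e * sigma^2)."""
    return 0.5 * math.log(2.0 * math.pi * math.e * sigma ** 2)


def categorical_entropy(probs):
    """Discrete entropy in nats: -sum p * log p, skipping zero-probability outcomes."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# A wider Normal is less predictable, so its entropy is larger.
print(normal_entropy(1.0), normal_entropy(2.0))

# A fair coin attains the two-outcome maximum log(2); a certain outcome has zero entropy.
print(categorical_entropy([0.5, 0.5]), categorical_entropy([1.0, 0.0]))
```

Note that differential entropy can be negative (for example, a Normal with a very small sigma), whereas discrete entropy cannot.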
sample
→ Tensor
sample(sample_shape: tuple[int, ...] = ())
Draw independent, identically distributed samples.
The default implementation calls rsample and detaches the
result from the autograd graph, so gradients do not flow through
the returned tensor. Discrete distributions that cannot be
reparameterised must override this method directly instead.
Parameters
sample_shape : tuple[int, ...] = ()
    Leading shape of the draw; the result has shape
    sample_shape + batch_shape + event_shape.
    Default is (), which returns a single sample with shape
    batch_shape + event_shape.
Returns
Tensor
    A detached tensor (no gradient) of shape
    sample_shape + batch_shape + event_shape.
rsample
→ Tensor
rsample(sample_shape: tuple[int, ...] = ())
Reparameterised sample: gradients flow through the sample back to the distribution parameters.
Unlike sample, rsample expresses the stochastic node as
a deterministic transformation of a parameter-free noise variable:
$$x = g_\theta(\varepsilon), \qquad \varepsilon \sim p(\varepsilon).$$
This factorisation allows the gradient $\nabla_\theta \mathbb{E}[f(x)]$ to be estimated cheaply via the pathwise (re-parameterisation) estimator, which typically has much lower variance than the REINFORCE estimator.
Concrete distributions must override either this or sample.
Only distributions that admit a differentiable sampler set
has_rsample = True.
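The variance gap between the two estimators can be seen in a toy setting. The sketch below is not lucid code: it assumes x ~ Normal(mu, sigma), the objective f(x) = x^2, and a hypothetical helper gradient_samples that returns per-sample gradient terms for the pathwise estimator and for the score-function (REINFORCE) estimator.

```python
import random
import statistics


def gradient_samples(mu, sigma, n_samples=5000, seed=0):
    """Per-sample estimates of d/dmu E[x^2] for x ~ Normal(mu, sigma)."""
    rng = random.Random(seed)
    pathwise, score = [], []
    for _ in range(n_samples):
        eps = rng.gauss(0.0, 1.0)
        x = mu + sigma * eps
        # Pathwise: differentiate f(x) = x^2 through x = mu + sigma * eps.
        pathwise.append(2.0 * x)
        # Score function: f(x) * d/dmu log p(x), with d/dmu log p(x) = (x - mu) / sigma^2.
        score.append(x * x * (x - mu) / sigma ** 2)
    return pathwise, score


pathwise, score = gradient_samples(mu=1.5, sigma=0.5)
pathwise_var = statistics.variance(pathwise)
score_var = statistics.variance(score)
print(statistics.mean(pathwise), statistics.mean(score))  # both approximate 2 * mu
print(pathwise_var, score_var)
```

Both estimators are unbiased for 2 * mu here, but in this example the score-function terms are far more dispersed, so many more samples are needed for the same accuracy.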
Parameters
sample_shape : tuple[int, ...] = ()
    Leading shape of the draw; the result has shape
    sample_shape + batch_shape + event_shape.
Returns
Tensor
    A tensor attached to the autograd graph through the distribution parameters.
Raises
NotImplementedError
    If the distribution is not reparameterisable
    (has_rsample = False). Use sample
    instead in that case.
log_prob
→ Tensor
log_prob(value: Tensor)
Log-probability (log-density) evaluated at value.
For a continuous distribution with density $p(x)$ this returns $\log p(x)$. For a discrete distribution with probability mass function $P(X = x)$ this returns $\log P(X = x)$.
Working in log-space is numerically preferable to evaluating the density directly: products of probabilities become sums of log-probabilities, avoiding underflow for long sequences.
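The underflow problem is easy to reproduce with ordinary floats. The following standard-library sketch uses made-up per-step probabilities (100 values of 1e-5) purely for illustration.

```python
import math

# Hypothetical sequence: 100 independent events, each with probability 1e-5.
probs = [1e-5] * 100

# Direct product: the true value 1e-500 is below the smallest positive
# double, so the running product underflows to exactly 0.0.
product = 1.0
for p in probs:
    product *= p

# Sum of logs: the same quantity stays comfortably representable.
log_total = sum(math.log(p) for p in probs)

print(product)     # 0.0
print(log_total)   # roughly -1151.29
```

Once the product has collapsed to 0.0, taking its logarithm afterwards is no help (math.log(0.0) raises ValueError), which is why the log-probabilities should be accumulated from the start.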
Parameters
value : Tensor
    Values at which to evaluate the log-probability; the shape must be
    broadcastable with batch_shape + event_shape.
Returns
Tensor
    Log-probability tensor of shape
    broadcast(value.shape, batch_shape + event_shape)[:-len(event_shape)].
    In the scalar / univariate case this simplifies to
    broadcast(value.shape, batch_shape).
Raises
NotImplementedError
cdf
→ Tensor
cdf(value: Tensor)
Cumulative distribution function (CDF) evaluated at value.
Returns the probability that a random variable $X$ drawn
from this distribution is less than or equal to value:
$$F(x) = P(X \le x).$$
Parameters
value : Tensor
Returns
Tensor
    Values in $[0, 1]$ with the same shape as
    broadcast(value, batch_shape).
Raises
NotImplementedError
icdf
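→ Tensor
icdf(value: Tensor)
Inverse CDF (quantile function / percent-point function).
Given a probability $p \in [0, 1]$ returns the smallest $x$ such that $F(x) \ge p$:
$$F^{-1}(p) = \inf\{x : F(x) \ge p\}.$$
The quantile function is particularly useful for inverse-CDF (Smirnov transform) sampling: if $U \sim \mathrm{Uniform}(0, 1)$ then $F^{-1}(U)$ is distributed according to $F$.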
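Inverse-CDF sampling can be sketched with the Exponential distribution, whose quantile function has a simple closed form. This is a standard-library illustration, not the lucid implementation; exponential_icdf and exponential_cdf are hypothetical helper names.

```python
import math
import random


def exponential_cdf(x, rate):
    """CDF of Exponential(rate): F(x) = 1 - exp(-rate * x) for x >= 0."""
    return 1.0 - math.exp(-rate * x)


def exponential_icdf(p, rate):
    """Quantile function of Exponential(rate): -log(1 - p) / rate for p in [0, 1)."""
    return -math.log1p(-p) / rate


# Push uniform draws through the quantile function to obtain exponential samples.
rng = random.Random(0)
samples = [exponential_icdf(rng.random(), rate=2.0) for _ in range(20000)]
sample_mean = sum(samples) / len(samples)

print(sample_mean)   # should be close to 1 / rate = 0.5
```

The round trip exponential_cdf(exponential_icdf(p, rate), rate) recovers p up to floating-point error, which is a convenient sanity check for any cdf / icdf pair.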
Parameters
value : Tensor
Returns
Tensor
    Quantiles with the same shape as
    broadcast(value, batch_shape).
Raises
NotImplementedError
prob
→ Tensor
prob(value: Tensor)
Probability density (or mass) at value.
Computed as exp(log_prob(value)) via log_prob. For
numerical stability prefer working with log_prob directly;
use this method only when the raw density value is needed.
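The relationship between the two methods, and the reason to prefer the log form, can be illustrated with a hand-written Normal density. This sketch uses only the standard library; normal_log_prob and normal_prob are hypothetical stand-ins, not lucid functions.

```python
import math


def normal_log_prob(x, loc=0.0, scale=1.0):
    """Log-density of Normal(loc, scale) at x."""
    z = (x - loc) / scale
    return -0.5 * z * z - math.log(scale) - 0.5 * math.log(2.0 * math.pi)


def normal_prob(x, loc=0.0, scale=1.0):
    """Density obtained by exponentiating the log-density."""
    return math.exp(normal_log_prob(x, loc, scale))


print(normal_prob(0.0))        # 1 / sqrt(2 * pi), about 0.3989

# Far in the tail the density underflows to 0.0, while the log-density
# remains an ordinary finite number.
print(normal_prob(50.0))       # 0.0
print(normal_log_prob(50.0))   # about -1250.92
```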
Parameters
value : Tensor
Returns
Tensor
    Non-negative density or probability-mass values.
__repr__
→ str
__repr__()
Concise string representation showing parameter shapes.
Returns
str
    E.g. "Normal(loc=(3,), scale=(3,))" or
    "Bernoulli(probs=0.3)" for a scalar.