class

MixtureSameFamily

extends Distribution
MixtureSameFamily(mixture_distribution: Categorical, component_distribution: Distribution, validate_args: bool | None = None)

Finite mixture model where all components share the same distribution family.

MixtureSameFamily combines:

  • a mixture distribution: a Categorical over $K$ components that assigns mixing weight $\pi_k$ to component $k$,
  • a component distribution: a single Distribution whose rightmost batch dimension has size $K$ (one set of parameters per component).

This encodes the generative process:

$$k \sim \operatorname{Categorical}(\pi_1, \ldots, \pi_K), \quad X \mid k \sim p_k(\cdot)$$

Sampling is non-reparameterised: the drawn index $k$ is discrete, so a sample is a discontinuous function of the mixture weights and gradients cannot flow through it. For differentiable training through mixture models, consider optimising an evidence lower bound (ELBO) or using relaxed Categorical samples.
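As one illustration of the relaxed Categorical samples mentioned above, the sketch below uses the Gumbel-softmax relaxation in plain Python. It is not part of lucid: the helper name `relaxed_categorical`, the temperature values, and the way the soft weights are combined with component draws are all assumptions made for this example.

```python
import math
import random


def relaxed_categorical(log_weights, temperature, rng):
    """Return soft one-hot weights via the Gumbel-softmax relaxation (illustrative helper)."""
    perturbed = []
    for lw in log_weights:
        # Clamp the uniform draw away from 0 and 1 so both logs stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))
        perturbed.append((lw + gumbel) / temperature)
    # Softmax with the maximum subtracted for numerical stability.
    m = max(perturbed)
    exps = [math.exp(p - m) for p in perturbed]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
log_weights = [math.log(0.3), math.log(0.7)]
soft = relaxed_categorical(log_weights, temperature=0.5, rng=rng)

# Reparameterised Normal draws (loc + scale * eps), blended by the soft weights.
locs, scales = [-2.0, 2.0], [0.5, 1.0]
component_draws = [m + s * rng.gauss(0.0, 1.0) for m, s in zip(locs, scales)]
relaxed_sample = sum(w * x for w, x in zip(soft, component_draws))
```

The blended value is a smooth function of the mixture weights, but it is only an approximation: it follows the true mixture distribution only in the limit of low temperature, where the soft weights approach a one-hot vector.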

Parameters

mixture_distribution: Categorical
A Categorical distribution over $K$ components. Its batch_shape must be compatible with the leading batch dimensions of component_distribution.
component_distribution: Distribution
Any distribution whose rightmost batch dimension equals $K$ (the number of components). For example, a Normal(loc=..., scale=...) with loc.shape[-1] == K.
validate_args: bool | None = None
If True, validate parameter constraints at construction time.

Attributes

mixture_distribution: Categorical
The $K$-way categorical distribution over mixing weights.
component_distribution: Distribution
The per-component distributions, batched over $K$.

Notes

Log-probability is computed via the log-sum-exp trick to avoid underflow when summing exponentially small terms:

$$\log p(x) = \log \sum_{k=1}^{K} \pi_k \, p_k(x) = \operatorname{logsumexp}_k \bigl[\log \pi_k + \log p_k(x)\bigr]$$
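The effect of the log-sum-exp identity can be seen in a small standalone sketch. It uses only the Python standard library rather than lucid, and the helper names (`logsumexp`, `normal_log_pdf`, `mixture_log_prob`) are illustrative, not the library's internals. The mixture parameters are taken from the Examples section.

```python
import math


def logsumexp(values):
    """Stable log(sum(exp(v))) for a non-empty list of floats."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def normal_log_pdf(x, loc, scale):
    """Log-density of a univariate Normal(loc, scale) at x."""
    z = (x - loc) / scale
    return -0.5 * z * z - math.log(scale) - 0.5 * math.log(2.0 * math.pi)


def mixture_log_prob(x, weights, locs, scales):
    """log p(x) = logsumexp_k [log pi_k + log p_k(x)]."""
    terms = [
        math.log(w) + normal_log_pdf(x, m, s)
        for w, m, s in zip(weights, locs, scales)
    ]
    return logsumexp(terms)


weights, locs, scales = [0.3, 0.7], [-2.0, 2.0], [0.5, 1.0]

# Far in the tail, every component density underflows to exactly 0.0 ...
naive_density = sum(
    w * math.exp(normal_log_pdf(100.0, m, s))
    for w, m, s in zip(weights, locs, scales)
)
# ... so log(naive_density) would fail, while the stable form stays finite.
stable_log_prob = mixture_log_prob(100.0, weights, locs, scales)
```

Subtracting the largest term before exponentiating keeps at least one summand equal to 1, which is why the result remains finite even when the density itself is not representable as a float.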

Mean (law of total expectation), where $\mu_k$ is the mean of component $k$:

$$E[X] = \sum_{k=1}^{K} \pi_k \mu_k$$

Variance (law of total variance), where $\sigma_k^2$ is the variance of component $k$:

$$\operatorname{Var}[X] = \sum_k \pi_k \sigma_k^2 + \sum_k \pi_k (\mu_k - E[X])^2$$

This decomposes into within-component variance (first term) and between-component variance (second term).
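As a worked check of the two formulas, the sketch below evaluates them by hand for the 2-component Gaussian mixture used in the Examples section (weights 0.3 and 0.7, means -2 and 2, scales 0.5 and 1). It is plain Python and does not call lucid. The variable names are illustrative.

```python
weights = [0.3, 0.7]
locs = [-2.0, 2.0]      # component means mu_k
scales = [0.5, 1.0]     # component standard deviations sigma_k

# Law of total expectation: E[X] = sum_k pi_k * mu_k
mean = sum(w * m for w, m in zip(weights, locs))

# Law of total variance: within-component plus between-component terms.
within = sum(w * s ** 2 for w, s in zip(weights, scales))
between = sum(w * (m - mean) ** 2 for w, m in zip(weights, locs))
variance = within + between

# Up to floating-point rounding:
#   mean = 0.8, within = 0.775, between = 3.36, variance = 4.135
```

Most of the variance here comes from the between-component term, which reflects how far apart the two component means are relative to their individual spreads.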

Examples

>>> import lucid
>>> from lucid.distributions import MixtureSameFamily, Categorical, Normal
>>> # 2-component Gaussian mixture
>>> mix = Categorical(probs=lucid.tensor([0.3, 0.7]))
>>> comp = Normal(
...     loc=lucid.tensor([-2.0, 2.0]),
...     scale=lucid.tensor([0.5, 1.0]),
... )
>>> dist = MixtureSameFamily(mix, comp)
>>> samples = dist.sample((200,))
>>> samples.shape
(200,)

Methods (5)

dunder

__init__

None
__init__(mixture_distribution: Categorical, component_distribution: Distribution, validate_args: bool | None = None)

Construct a finite mixture distribution from a mixing distribution and a batch of same-family components.

Parameters

mixture_distribution: Categorical
Discrete distribution over the K mixture components. Its batch_shape must match the resulting mixture batch shape, and its event_shape must be empty.
component_distribution: Distribution
Family of component distributions. The rightmost batch dimension indexes the K components, i.e. component_distribution.batch_shape must end with K, where K equals mixture_distribution._num_events.
validate_args: bool | None = None
If True, validate parameter constraints at construction time.

Raises

ValueError
If mixture_distribution is not Categorical or if the rightmost batch dim of component_distribution does not equal K.
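
The shape requirement above can be pictured with a small standalone check. This is a sketch only: the helper `resolve_mixture_batch_shape` is hypothetical, and treating "compatible" as right-aligned broadcasting is an assumption rather than a statement about what lucid does internally.

```python
def resolve_mixture_batch_shape(num_components, mix_batch_shape, comp_batch_shape):
    """Validate shapes and return the mixture batch shape (hypothetical helper)."""
    if len(comp_batch_shape) == 0 or comp_batch_shape[-1] != num_components:
        raise ValueError(
            f"rightmost batch dim of components must equal K={num_components}, "
            f"got batch_shape={tuple(comp_batch_shape)}"
        )
    # The component axis is consumed by the mixture; the rest is batch.
    leading = tuple(comp_batch_shape[:-1])
    out = []
    # Assumed rule: right-aligned broadcasting between the two batch shapes.
    for i in range(1, max(len(mix_batch_shape), len(leading)) + 1):
        a = mix_batch_shape[-i] if i <= len(mix_batch_shape) else 1
        b = leading[-i] if i <= len(leading) else 1
        if a != b and a != 1 and b != 1:
            raise ValueError(f"incompatible batch dims {a} and {b}")
        out.append(max(a, b))
    return tuple(reversed(out))


# The mixture from the Examples section: K = 2, no extra batch dimensions.
example_shape = resolve_mixture_batch_shape(2, (), (2,))
# A batch of 4 mixtures with 3 components each.
batched_shape = resolve_mixture_batch_shape(3, (4,), (4, 3))
```

Under these assumptions, the first call yields an empty batch shape and the second yields a batch of four mixtures. A component batch shape whose last dimension differs from K raises ValueError.
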
prop

mean

Tensor
mean: Tensor

Expected value via the law of total expectation: $E[X] = \sum_k \pi_k \mu_k$.

Returns

Tensor

Mean of the mixture, shape (*batch_shape, *event_shape).

prop

variance

Tensor
variance: Tensor

Variance via the law of total variance.

Decomposes as within-component variance plus between-component variance:

$$\operatorname{Var}[X] = \underbrace{\sum_k \pi_k \sigma_k^2}_{\text{within}} + \underbrace{\sum_k \pi_k (\mu_k - E[X])^2}_{\text{between}}$$

Returns

Tensor

Variance of the mixture, shape (*batch_shape, *event_shape).

fn

sample

Tensor
sample(sample_shape: tuple[int, ...] = ())

Draw samples from the mixture by ancestral sampling.

The procedure is:

  1. Draw component indices $k \sim \operatorname{Categorical}(\pi)$.
  2. Draw one sample per component for the full output shape.
  3. Gather the sample corresponding to the drawn component index.
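
The three steps can be sketched for a scalar Gaussian mixture in plain Python. This mirrors the procedure described above but is not lucid's implementation: the function name `sample_mixture` and the use of the standard `random` module are assumptions for illustration.

```python
import random


def sample_mixture(weights, locs, scales, n, rng):
    """Draw n samples from a 1-D Gaussian mixture by ancestral sampling."""
    samples = []
    for _ in range(n):
        # Step 1: draw the component index k ~ Categorical(weights).
        k = rng.choices(range(len(weights)), weights=weights)[0]
        # Step 2: draw one value from every component.
        per_component = [rng.gauss(m, s) for m, s in zip(locs, scales)]
        # Step 3: keep only the value belonging to the drawn component.
        samples.append(per_component[k])
    return samples


rng = random.Random(0)
samples = sample_mixture([0.3, 0.7], [-2.0, 2.0], [0.5, 1.0], 20000, rng)
sample_mean = sum(samples) / len(samples)
```

Drawing from every component and then gathering is wasteful in a scalar loop like this one, but in a tensor library it keeps the whole procedure vectorised, since one batched draw plus one gather replaces per-sample branching.
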

Parameters

sample_shape: tuple[int, ...] = ()
Leading shape of the output sample batch.

Returns

Tensor

Samples of shape (*sample_shape, *batch_shape, *event_shape).

fn

log_prob

Tensor
log_prob(value: Tensor)

Log-probability of the mixture evaluated at value.

Uses the numerically stable log-sum-exp identity:

$$\log p(x) = \operatorname{logsumexp}_k \bigl[\log \pi_k + \log p_k(x)\bigr]$$

Parameters

value: Tensor
Observation of shape (*batch_shape, *event_shape).

Returns

Tensor

Log-density values of shape (*batch_shape,).