class StudentT extends Distribution

StudentT(df: Tensor | float, loc: Tensor | float = 0.0, scale: Tensor | float = 1.0, validate_args: bool | None = None)


Student's t-distribution with location, scale, and degrees of freedom.

StudentT(df=ν, loc=μ, scale=σ) defines the three-parameter location-scale generalisation of Student's t. It arises naturally in:

Bayesian inference: the posterior predictive for a Normal likelihood with unknown mean and variance (Normal-InverseGamma conjugate model).
Robust regression: as a heavy-tailed alternative to the Normal for outlier-tolerant models.
Limit behaviour: as $\nu \to \infty$, the t-distribution converges to $\mathcal{N}(\mu, \sigma^2)$.

Parameters

df: Tensor | float

Degrees of freedom $\nu > 0$. Controls tail heaviness: $\nu = 1$ is the Cauchy distribution; the limit $\nu \to \infty$ is the Normal.

loc: Tensor | float = 0.0

Location parameter $\mu \in \mathbb{R}$. Default is 0.0.

scale: Tensor | float = 1.0

Scale parameter $\sigma > 0$. Default is 1.0.

validate_args: bool | None = None

If True, validate parameter constraints at construction time.

Attributes

df: Tensor

Degrees of freedom $\nu$.

loc: Tensor

Location parameter $\mu$.

scale: Tensor

Scale parameter $\sigma$.

Notes

PDF:

$$p(x; \nu, \mu, \sigma) = \frac{\Gamma\!\left(\frac{\nu+1}{2}\right)} {\Gamma\!\left(\frac{\nu}{2}\right) \sqrt{\pi \nu}\, \sigma} \left(1 + \frac{(x-\mu)^2}{\nu \sigma^2}\right)^{-(\nu+1)/2}$$

Log-PDF:

$$\log p(x) = \log\Gamma\!\left(\frac{\nu+1}{2}\right) - \log\Gamma\!\left(\frac{\nu}{2}\right) - \tfrac{1}{2}\log(\pi\nu) - \log\sigma - \frac{\nu+1}{2}\log\!\left(1 + \frac{z^2}{\nu}\right)$$

where $z = (x - \mu)/\sigma$.
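The log-PDF above can be checked numerically without the library. The following is a minimal standard-library sketch, not the library's implementation; the helper names `student_t_log_pdf` and `normal_log_pdf` are hypothetical. It evaluates the formula term by term and compares it against the two special cases mentioned in this section (Cauchy at $\nu = 1$, Normal as $\nu \to \infty$).

```python
import math


def student_t_log_pdf(x, df, loc=0.0, scale=1.0):
    """Log-density of the location-scale Student's t, term by term."""
    z = (x - loc) / scale
    return (
        math.lgamma((df + 1.0) / 2.0)
        - math.lgamma(df / 2.0)
        - 0.5 * math.log(math.pi * df)
        - math.log(scale)
        - (df + 1.0) / 2.0 * math.log1p(z * z / df)
    )


def normal_log_pdf(x, loc=0.0, scale=1.0):
    """Log-density of N(loc, scale^2), used as the large-df reference."""
    z = (x - loc) / scale
    return -0.5 * math.log(2.0 * math.pi) - math.log(scale) - 0.5 * z * z


# df = 1 is the Cauchy distribution, whose log-density at the mode is -log(pi).
cauchy_at_zero = student_t_log_pdf(0.0, df=1.0)

# For very large df the log-density should be close to the Normal one.
gap_to_normal = abs(student_t_log_pdf(1.3, df=1e6) - normal_log_pdf(1.3))
```

Using `math.lgamma` rather than `math.gamma` keeps the computation stable for large $\nu$, where the Gamma function itself would overflow.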

Moments:

Mean ($\nu > 1$): $E[X] = \mu$
Variance ($\nu > 2$): $\operatorname{Var}[X] = \sigma^2 \nu / (\nu - 2)$
The distribution has no finite variance for $\nu \leq 2$ and no finite mean for $\nu \leq 1$.

Reparameterised sampling uses the representation

$$T = \mu + \sigma \cdot z \cdot \sqrt{\nu / g}$$

where $z \sim \mathcal{N}(0,1)$ is the differentiable variate and $g \sim \chi^2(\nu)$ is detached (the Gamma sampler uses rejection sampling whose path is not differentiable w.r.t. $\nu$).
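The sampling representation can be illustrated with the standard library alone. This is a sketch under stated assumptions, not the library's sampler: `sample_student_t` is a hypothetical helper, and it shows only the forward draw, since plain Python floats carry no gradients. It relies on the identity $\chi^2(\nu) = \mathrm{Gamma}(\text{shape} = \nu/2, \text{scale} = 2)$.

```python
import math
import random


def sample_student_t(rng, df, loc=0.0, scale=1.0):
    """Draw one sample via T = loc + scale * z * sqrt(df / g)."""
    z = rng.gauss(0.0, 1.0)              # z ~ N(0, 1): the reparameterised variate
    g = rng.gammavariate(df / 2.0, 2.0)  # g ~ chi^2(df) as Gamma(shape=df/2, scale=2)
    return loc + scale * z * math.sqrt(df / g)


rng = random.Random(0)  # fixed seed so the run is reproducible
draws = [sample_student_t(rng, df=5.0, loc=2.0, scale=0.5) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)  # should be close to loc because df > 1
```

In an autodiff setting, `z` would be the only random input that gradients pass through on the way to `loc` and `scale`, while `g` would be treated as a constant.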

Examples

>>> import lucid
>>> from lucid.distributions import StudentT
>>> # Heavy-tailed (Cauchy)
>>> cauchy = StudentT(df=1.0, loc=0.0, scale=1.0)
>>> # Near-Normal
>>> approx_normal = StudentT(df=100.0, loc=0.0, scale=1.0)
>>> samples = approx_normal.rsample((500,))


Constructors

dunder

init

→None

__init__(df: Tensor | float, loc: Tensor | float = 0.0, scale: Tensor | float = 1.0, validate_args: bool | None = None)


Construct a Student's t-distribution.

Parameters

df: Tensor | float

Degrees of freedom $\nu > 0$. Lower values produce heavier tails: $\nu = 1$ is the Cauchy distribution, while as $\nu \to \infty$ the distribution approaches $\mathcal{N}(\mu, \sigma^2)$.

loc: Tensor | float = 0.0

Location parameter $\mu \in \mathbb{R}$. Default is 0.0.

scale: Tensor | float = 1.0

Scale parameter $\sigma > 0$. Default is 1.0.

validate_args: bool | None = None

If True, validate parameter constraints at construction time.

Notes

All three parameters are broadcast against each other so that batches with mixed scalar / tensor parameters work naturally. The resulting batch_shape is the broadcast shape of (df, loc, scale).
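To make the resulting batch_shape concrete, here is a small standard-library sketch of shape broadcasting. It assumes NumPy-style rules (align trailing dimensions, size-1 dimensions stretch), which this page does not state explicitly; `broadcast_shape` is a hypothetical helper, not part of the library.

```python
from itertools import zip_longest


def broadcast_shape(*shapes):
    """Broadcast several shapes under NumPy-style rules (assumed here)."""
    result = []
    # Walk the shapes from their trailing dimension; missing dims count as 1.
    for dims in zip_longest(*(reversed(s) for s in shapes), fillvalue=1):
        sizes = {d for d in dims if d != 1}
        if len(sizes) > 1:
            raise ValueError(f"shapes {shapes} are not broadcastable")
        result.append(sizes.pop() if sizes else 1)
    return tuple(reversed(result))


# A scalar df, a loc of shape (3, 1) and a scale of shape (4,).
batch_shape = broadcast_shape((), (3, 1), (4,))
```

Under that assumption, a scalar parameter has shape `()` and never constrains the batch shape, which is why mixing floats with tensors works.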

Examples

>>> from lucid.distributions import StudentT
>>> d = StudentT(df=5.0, loc=2.0, scale=0.5)
>>> d.mean
Tensor(2.0)

Properties

prop

mean

→Tensor

mean: Tensor


Expected value of the Student's t-distribution.

$$E[X] = \mu \quad (\nu > 1)$$

The mean is only mathematically defined for $\nu > 1$. For $\nu \leq 1$ (e.g., the Cauchy distribution) the first moment does not exist. This property does not check $\nu$ and returns $\mu$ unconditionally; callers are responsible for checking $\nu > 1$ when that matters.

Returns

Tensor

Location parameter $\mu$, shape batch_shape.

Examples

>>> StudentT(df=5.0, loc=3.0).mean
Tensor(3.0)

prop

variance

→Tensor

variance: Tensor


Variance of the Student's t-distribution.

$$\operatorname{Var}[X] = \frac{\sigma^2 \nu}{\nu - 2} \quad (\nu > 2)$$

The variance is only finite for $\nu > 2$. For $1 < \nu \leq 2$ the distribution has a defined mean but infinite variance. For $\nu \leq 1$ neither moment exists. This property computes $\sigma^2 \nu / (\nu - 2)$ algebraically, so the caller must guard against $\nu \leq 2$: the expression divides by zero at $\nu = 2$ and is negative for $\nu < 2$.

Returns

Tensor

Variance $\sigma^2 \nu / (\nu - 2)$, shape batch_shape.

Examples

>>> StudentT(df=4.0, scale=1.0).variance  # 4/(4-2) = 2.0
Tensor(2.0)
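Because the mean and variance properties do not check $\nu$, a caller-side guard may be useful. The sketch below works on plain floats and is only one possible convention: the helper name `student_t_moments` and the choice of `nan` for undefined moments and `inf` for infinite variance are assumptions, not library behaviour.

```python
import math


def student_t_moments(df, loc=0.0, scale=1.0):
    """Return (mean, variance), with nan where undefined and inf where infinite."""
    # The mean exists only for df > 1.
    mean = loc if df > 1.0 else math.nan
    if df > 2.0:
        variance = scale * scale * df / (df - 2.0)
    elif df > 1.0:
        variance = math.inf  # mean exists, second moment diverges
    else:
        variance = math.nan  # neither moment exists
    return mean, variance


finite_case = student_t_moments(df=4.0)
infinite_variance_case = student_t_moments(df=1.5)
cauchy_case = student_t_moments(df=1.0)
```

Returning `nan` rather than raising keeps the helper usable inside batched code, at the cost of requiring the caller to test for it.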

Instance methods

entropy

→Tensor

entropy()


Shannon entropy of the Student's t-distribution (in nats).

$$H = \log\!\left(\sigma\sqrt{\nu}\, B\!\left(\tfrac{1}{2}, \tfrac{\nu}{2}\right)\right) + \frac{\nu+1}{2} \left[\psi\!\left(\tfrac{\nu+1}{2}\right) - \psi\!\left(\tfrac{\nu}{2}\right)\right]$$

where $B(a,b) = \Gamma(a)\Gamma(b)/\Gamma(a+b)$ is the Beta function and $\psi$ is the digamma function.

As $\nu \to \infty$ this converges to the entropy of $\mathcal{N}(\mu, \sigma^2)$, which is $\tfrac{1}{2}\log(2\pi e\sigma^2)$.

Returns

Tensor

Entropy in nats, shape batch_shape.

Examples

>>> StudentT(df=1.0, loc=0.0, scale=1.0).entropy()  # Cauchy: log(4π) ≈ 2.531
Tensor(2.5310)
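The entropy formula and its two reference values (the Cauchy case and the Normal limit) can be reproduced with the standard library. Python's `math` module has no digamma function, so this sketch includes a simple approximation (upward recurrence followed by an asymptotic series); both `digamma` and `student_t_entropy` are hypothetical helpers written for illustration, not the library's code.

```python
import math


def digamma(x):
    """Approximate psi(x) for x > 0: recurrence up to x >= 6, then a series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x  # psi(x) = psi(x + 1) - 1/x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # Asymptotic tail: 1/(12 x^2) - 1/(120 x^4) + 1/(252 x^6)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series


def student_t_entropy(df, scale=1.0):
    """Entropy in nats, following the formula term by term."""
    half_df = 0.5 * df
    half_df1 = 0.5 * (df + 1.0)
    # log B(1/2, df/2) expressed through log-gamma
    log_beta = math.lgamma(0.5) + math.lgamma(half_df) - math.lgamma(half_df1)
    return (
        math.log(scale)
        + 0.5 * math.log(df)
        + log_beta
        + half_df1 * (digamma(half_df1) - digamma(half_df))
    )


cauchy_entropy = student_t_entropy(df=1.0)  # compare with log(4 * pi)
normal_entropy = 0.5 * math.log(2.0 * math.pi * math.e)
near_normal_gap = abs(student_t_entropy(df=1e4) - normal_entropy)
```

The location $\mu$ does not appear because entropy is invariant under shifts; only $\sigma$ and $\nu$ matter.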

log_prob

→Tensor

log_prob(value: Tensor)


Log-density of value under the Student's t-distribution.

$$\log p(x; \nu, \mu, \sigma) = \log\Gamma\!\left(\tfrac{\nu+1}{2}\right) - \log\Gamma\!\left(\tfrac{\nu}{2}\right) - \tfrac{1}{2}\log(\pi\nu) - \log\sigma - \tfrac{\nu+1}{2} \log\!\left(1 + \frac{z^2}{\nu}\right)$$

where $z = (x - \mu) / \sigma$.

Parameters

value: Tensor

Real-valued observations.

Returns

Tensor

Element-wise log-densities, shape batch_shape.

Examples

>>> import lucid
>>> from lucid.distributions import StudentT
>>> d = StudentT(df=1.0, loc=0.0, scale=1.0)  # Cauchy
>>> d.log_prob(lucid.tensor(0.0))  # -log(π) ≈ -1.145
Tensor(-1.1447)

rsample

→Tensor

rsample(sample_shape: tuple[int, ...] = ())


Reparameterised sample: gradient flows through the Normal variate.

$T = \mu + \sigma \cdot z / \sqrt{g / \nu}$, where $z \sim \mathcal{N}(0,1)$ is differentiable and $g \sim \chi^2(\nu)$ is detached. This is standard practice: the gradient with respect to df is not tracked.

sample

→Tensor

sample(sample_shape: tuple[int, ...] = ())


Draw non-differentiable samples from the Student's t-distribution.

Wraps rsample inside a no_grad context so that the returned samples have no gradient history. Use rsample directly when gradient flow through the location-scale reparameterisation is required.

Parameters

sample_shape: tuple[int, ...] = ()

Leading shape dimensions for the sample batch. Default is ().

Returns

Tensor

Detached samples of shape sample_shape + batch_shape.

Examples

>>> d = StudentT(df=3.0, loc=0.0, scale=1.0)
>>> x = d.sample((200,))
