class StudentT extends Distribution
StudentT(df: Tensor | float, loc: Tensor | float = 0.0, scale: Tensor | float = 1.0, validate_args: bool | None = None)

Student's t-distribution with location, scale, and degrees of freedom.

StudentT(df=ν, loc=μ, scale=σ) defines the three-parameter location-scale generalisation of Student's t. It arises naturally in:

  • Bayesian inference: the posterior predictive for a Normal likelihood with unknown mean and variance (Normal-InverseGamma conjugate model).
  • Robust regression: as a heavy-tailed alternative to the Normal for outlier-tolerant models.
  • Limit behaviour: as $\nu \to \infty$, the t-distribution converges to $\mathcal{N}(\mu, \sigma^2)$.
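
To make the Bayesian-inference bullet concrete, the following is a minimal sketch of the Normal-InverseGamma conjugate update and the Student's t posterior predictive it induces. The function name and the hyperparameter names (mu0, kappa0, alpha0, beta0) are assumptions chosen for illustration; they are not part of lucid.

```python
import math


def nig_posterior_predictive(data, mu0, kappa0, alpha0, beta0):
    """Return (df, loc, scale) of the Student's t posterior predictive.

    Hypothetical helper: assumes a Normal likelihood with unknown mean and
    variance and a Normal-InverseGamma(mu0, kappa0, alpha0, beta0) prior.
    """
    n = len(data)
    xbar = sum(data) / n
    ss = sum((x - xbar) ** 2 for x in data)  # sum of squared deviations

    # Standard conjugate updates of the four hyperparameters.
    kappa_n = kappa0 + n
    mu_n = (kappa0 * mu0 + n * xbar) / kappa_n
    alpha_n = alpha0 + n / 2.0
    beta_n = beta0 + 0.5 * ss + kappa0 * n * (xbar - mu0) ** 2 / (2.0 * kappa_n)

    # The predictive is Student's t with df = 2 * alpha_n.
    df = 2.0 * alpha_n
    scale = math.sqrt(beta_n * (kappa_n + 1.0) / (alpha_n * kappa_n))
    return df, mu_n, scale


df, loc, scale = nig_posterior_predictive(
    [1.0, 2.0, 3.0], mu0=0.0, kappa0=1.0, alpha0=1.0, beta0=1.0
)
```

Under these assumptions the returned triple could be passed on as StudentT(df=df, loc=loc, scale=scale).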

Parameters

df: Tensor | float
Degrees of freedom $\nu > 0$. Controls tail heaviness. $\nu = 1$ is the Cauchy distribution; the limit $\nu \to \infty$ recovers the Normal.
loc: Tensor | float = 0.0
Location parameter $\mu \in \mathbb{R}$. Default is 0.0.
scale: Tensor | float = 1.0
Scale parameter $\sigma > 0$. Default is 1.0.
validate_args: bool | None = None
If True, validate parameter constraints at construction time.

Attributes

df: Tensor
Degrees of freedom $\nu$.
loc: Tensor
Location parameter $\mu$.
scale: Tensor
Scale parameter $\sigma$.

Notes

PDF:

$$p(x; \nu, \mu, \sigma) = \frac{\Gamma\!\left(\frac{\nu+1}{2}\right)}{\Gamma\!\left(\frac{\nu}{2}\right) \sqrt{\pi \nu}\, \sigma} \left(1 + \frac{(x-\mu)^2}{\nu \sigma^2}\right)^{-(\nu+1)/2}$$

Log-PDF:

$$\log p(x) = \log\Gamma\!\left(\frac{\nu+1}{2}\right) - \log\Gamma\!\left(\frac{\nu}{2}\right) - \tfrac{1}{2}\log(\pi\nu) - \log\sigma - \frac{\nu+1}{2}\log\!\left(1 + \frac{z^2}{\nu}\right)$$

where $z = (x - \mu)/\sigma$.
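
The log-density can be transcribed term by term using only the Python standard library. This is an illustrative sketch, not lucid's implementation; the function name student_t_log_pdf is an assumption.

```python
import math


def student_t_log_pdf(x, df, loc=0.0, scale=1.0):
    """Scalar log-density of the location-scale Student's t (illustrative)."""
    z = (x - loc) / scale
    return (
        math.lgamma((df + 1.0) / 2.0)
        - math.lgamma(df / 2.0)
        - 0.5 * math.log(math.pi * df)
        - math.log(scale)
        # log1p keeps precision when z**2 / df is small
        - (df + 1.0) / 2.0 * math.log1p(z * z / df)
    )


# Cauchy case (df = 1) at the mode: the formula reduces to -log(pi).
cauchy_at_zero = student_t_log_pdf(0.0, df=1.0)

# Very large df: the value should be close to the standard Normal log-density.
near_normal = student_t_log_pdf(1.0, df=1e6)
normal_ref = -0.5 * math.log(2.0 * math.pi) - 0.5
```

Working with lgamma rather than the Gamma function itself avoids overflow for large degrees of freedom.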

Moments:

  • Mean ($\nu > 1$): $E[X] = \mu$
  • Variance ($\nu > 2$): $\operatorname{Var}[X] = \sigma^2 \nu / (\nu - 2)$
  • The distribution has no finite variance for $\nu \leq 2$ and no finite mean for $\nu \leq 1$.

Reparameterised sampling uses the representation

$$T = \mu + \sigma \cdot z \cdot \sqrt{\nu / g}$$

where $z \sim \mathcal{N}(0,1)$ is the differentiable variate and $g \sim \chi^2(\nu)$ is detached (the Gamma sampler uses rejection sampling whose path is not differentiable w.r.t. $\nu$).
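
The representation can be sketched without any tensor library. The helper below is hypothetical and does no gradient tracking; it only illustrates how a Normal draw and a chi-square draw combine, using the identity that $\chi^2(\nu)$ is a Gamma distribution with shape $\nu/2$ and scale 2.

```python
import math
import random


def sample_student_t(df, loc=0.0, scale=1.0, rng=random):
    """Draw one Student's t variate as loc + scale * z * sqrt(df / g)."""
    z = rng.gauss(0.0, 1.0)              # z ~ N(0, 1)
    g = rng.gammavariate(df / 2.0, 2.0)  # g ~ chi2(df) = Gamma(shape=df/2, scale=2)
    return loc + scale * z * math.sqrt(df / g)


rng = random.Random(0)  # seeded generator for reproducibility
draws = [sample_student_t(10.0, loc=2.0, scale=1.0, rng=rng) for _ in range(20000)]

sample_mean = sum(draws) / len(draws)
sample_var = sum((d - sample_mean) ** 2 for d in draws) / (len(draws) - 1)
# For df = 10 the theoretical values are mean 2.0 and variance 10 / 8 = 1.25.
```

In a differentiable setting, loc and scale enter only through the final affine step, which is why gradients with respect to them are straightforward while the dependence on df through g is not.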

Examples

>>> import lucid
>>> from lucid.distributions import StudentT
>>> # Heavy-tailed (Cauchy)
>>> cauchy = StudentT(df=1.0, loc=0.0, scale=1.0)
>>> # Near-Normal
>>> approx_normal = StudentT(df=100.0, loc=0.0, scale=1.0)
>>> samples = approx_normal.rsample((500,))

Methods (7)

dunder

__init__

None
__init__(df: Tensor | float, loc: Tensor | float = 0.0, scale: Tensor | float = 1.0, validate_args: bool | None = None)

Construct a Student's t-distribution.

Parameters

df: Tensor | float
Degrees of freedom $\nu > 0$. Lower values produce heavier tails: $\nu = 1$ is the Cauchy distribution, while as $\nu \to \infty$ the distribution approaches $\mathcal{N}(\mu, \sigma^2)$.
loc: Tensor | float = 0.0
Location parameter $\mu \in \mathbb{R}$. Default is 0.0.
scale: Tensor | float = 1.0
Scale parameter $\sigma > 0$. Default is 1.0.
validate_args: bool | None = None
If True, validate parameter constraints at construction time.

Notes

All three parameters are broadcast against each other so that batches with mixed scalar / tensor parameters work naturally. The resulting batch_shape is the broadcast shape of (df, loc, scale).
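
As a rough illustration of how a broadcast batch_shape could be derived, the sketch below implements NumPy-style shape broadcasting in plain Python. It assumes lucid follows the usual trailing-axis broadcasting rules, which this page does not state explicitly; broadcast_shape is a hypothetical helper, not a lucid function.

```python
def broadcast_shape(*shapes):
    """NumPy-style broadcast of several shapes; () stands for a scalar."""
    ndim = max((len(s) for s in shapes), default=0)
    result = []
    for axis in range(1, ndim + 1):  # walk axes from the trailing end
        dims = {s[-axis] for s in shapes if len(s) >= axis}
        dims.discard(1)  # size-1 axes stretch to match anything
        if len(dims) > 1:
            raise ValueError(f"shapes {shapes} are not broadcastable")
        result.append(dims.pop() if dims else 1)
    return tuple(reversed(result))


# Scalar df, loc of shape (3, 1), scale of shape (4,).
batch_shape = broadcast_shape((), (3, 1), (4,))
```

With a scalar df, a (3, 1) loc, and a (4,) scale, this rule yields a (3, 4) batch.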

Examples

>>> from lucid.distributions import StudentT
>>> d = StudentT(df=5.0, loc=2.0, scale=0.5)
>>> d.mean
Tensor(2.0)
prop

mean

Tensor
mean: Tensor

Expected value of the Student's t-distribution.

$$E[X] = \mu \quad (\nu > 1)$$

The mean is only mathematically defined for $\nu > 1$. For $\nu \leq 1$ (e.g., the Cauchy distribution) the first moment does not exist. This property nevertheless returns $\mu$ unconditionally, without inspecting df; callers are responsible for checking $\nu > 1$ when that matters.

Returns

Tensor

Location parameter $\mu$, shape batch_shape.

Examples

>>> StudentT(df=5.0, loc=3.0).mean
Tensor(3.0)
prop

variance

Tensor
variance: Tensor

Variance of the Student's t-distribution.

$$\operatorname{Var}[X] = \frac{\sigma^2 \nu}{\nu - 2} \quad (\nu > 2)$$

The variance is only finite for $\nu > 2$. For $1 < \nu \leq 2$ the distribution has a defined mean but infinite variance. For $\nu \leq 1$ neither moment exists. This property computes $\sigma^2 \nu / (\nu - 2)$ algebraically, so the caller must guard against $\nu \leq 2$: the result is negative for $\nu < 2$ and infinite (division by zero) at $\nu = 2$.

Returns

Tensor

Variance $\sigma^2 \nu / (\nu - 2)$, shape batch_shape.

Examples

>>> StudentT(df=4.0, scale=1.0).variance  # 4/(4-2) = 2.0
Tensor(2.0)
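
Since the property does not guard against small degrees of freedom, a caller-side check might look like the following scalar sketch. The helper name and the choice of returning inf or NaN are assumptions made for illustration, not lucid behaviour.

```python
import math


def student_t_variance(df, scale=1.0):
    """Variance of Student's t with explicit handling of df <= 2 (illustrative)."""
    if df > 2.0:
        return scale ** 2 * df / (df - 2.0)
    if df > 1.0:
        return math.inf  # mean exists, variance is infinite
    return math.nan      # neither moment exists


finite = student_t_variance(4.0)      # 4 / (4 - 2)
heavy = student_t_variance(1.5)       # infinite variance regime
undefined = student_t_variance(1.0)   # Cauchy: variance undefined
```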
fn

rsample

Tensor
rsample(sample_shape: tuple[int, ...] = ())

Reparameterised sample: gradients flow to loc and scale through the location-scale transform of the Normal variate.

T = loc + scale · z / sqrt(g / df) where z ~ N(0,1) and g ~ Chi²(df). The chi-square draw g is detached (standard practice), so the gradient w.r.t. df is not tracked.

fn

sample

Tensor
sample(sample_shape: tuple[int, ...] = ())

Draw non-differentiable samples from the Student's t-distribution.

Wraps rsample inside a no_grad context so that the returned samples have no gradient history. Use rsample directly when gradient flow through the location-scale reparameterisation is required.

Parameters

sample_shape: tuple[int, ...] = ()
Leading shape dimensions for the sample batch. Default is ().

Returns

Tensor

Detached samples of shape sample_shape + batch_shape.

Examples

>>> d = StudentT(df=3.0, loc=0.0, scale=1.0)
>>> x = d.sample((200,))
fn

log_prob

Tensor
log_prob(value: Tensor)

Log-density of value under the Student's t-distribution.

$$\log p(x; \nu, \mu, \sigma) = \log\Gamma\!\left(\tfrac{\nu+1}{2}\right) - \log\Gamma\!\left(\tfrac{\nu}{2}\right) - \tfrac{1}{2}\log(\pi\nu) - \log\sigma - \tfrac{\nu+1}{2} \log\!\left(1 + \frac{z^2}{\nu}\right)$$

where $z = (x - \mu) / \sigma$.

Parameters

value: Tensor
Real-valued observations.

Returns

Tensor

Element-wise log-densities, with the shape of value broadcast against batch_shape.

Examples

>>> import lucid
>>> from lucid.distributions import StudentT
>>> d = StudentT(df=1.0, loc=0.0, scale=1.0)  # Cauchy
>>> d.log_prob(lucid.tensor(0.0))  # -log(π) ≈ -1.145
Tensor(-1.1447)
fn

entropy

Tensor
entropy()

Shannon entropy of the Student's t-distribution (in nats).

$$H = \log\!\left(\sigma\sqrt{\nu}\, B\!\left(\tfrac{1}{2}, \tfrac{\nu}{2}\right)\right) + \frac{\nu+1}{2} \left[\psi\!\left(\tfrac{\nu+1}{2}\right) - \psi\!\left(\tfrac{\nu}{2}\right)\right]$$

where $B(a,b) = \Gamma(a)\Gamma(b)/\Gamma(a+b)$ is the Beta function and $\psi$ is the digamma function.

As $\nu \to \infty$ this converges to the entropy of $\mathcal{N}(\mu, \sigma^2)$, which is $\tfrac{1}{2}\log(2\pi e\sigma^2)$.

Returns

Tensor

Entropy in nats, shape batch_shape.

Examples

>>> StudentT(df=1.0, loc=0.0, scale=1.0).entropy()  # Cauchy: log(4π) ≈ 2.531
Tensor(2.5310)
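
The entropy formula and its two limiting cases can be checked numerically with the standard library alone. The sketch below is not lucid's implementation: both function names are assumptions, and the digamma routine is a simple recurrence-plus-asymptotic-series approximation written for this illustration.

```python
import math


def digamma(x):
    """Approximate digamma for x > 0 (illustrative, roughly 1e-8 accuracy)."""
    result = 0.0
    while x < 6.0:  # shift upward with psi(x) = psi(x + 1) - 1 / x
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # Asymptotic tail: 1/(12 x^2) - 1/(120 x^4) + 1/(252 x^6)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series


def student_t_entropy(df, scale=1.0):
    """Entropy in nats: log(scale * sqrt(df) * B(1/2, df/2)) + digamma term."""
    half = (df + 1.0) / 2.0
    log_beta = math.lgamma(0.5) + math.lgamma(df / 2.0) - math.lgamma(half)
    return (
        math.log(scale)
        + 0.5 * math.log(df)
        + log_beta
        + half * (digamma(half) - digamma(df / 2.0))
    )


cauchy_entropy = student_t_entropy(1.0)    # compare with log(4 * pi)
large_df_entropy = student_t_entropy(1e4)  # compare with the Normal entropy
normal_entropy = 0.5 * math.log(2.0 * math.pi * math.e)
```

The location parameter does not appear because entropy is invariant under shifts; only scale and df matter.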