class

Bernoulli

extends ExponentialFamily
Bernoulli(probs: Tensor | float | None = None, logits: Tensor | float | None = None, validate_args: bool | None = None)

Bernoulli distribution over $\{0, 1\}$.

The Bernoulli is the simplest discrete distribution: a single coin flip with success probability $p \in [0, 1]$. It models a binary outcome and is the building block of the Binomial (sum of n IID Bernoullis), the Categorical (its multinomial generalisation), and most classification likelihoods in supervised learning.

Specify exactly one of probs or logits; the other is derived lazily via the sigmoid / logit transform so there is no redundant storage and the parameterisation chosen at construction remains the canonical one for autograd.
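The sigmoid / logit pair that links the two parameterisations can be sketched in plain Python. This is an illustration only, not lucid's implementation; the function names `sigmoid` and `logit` are assumptions introduced here.

```python
import math


def sigmoid(logit_value: float) -> float:
    """Map log-odds to a probability, branching on sign to avoid overflow."""
    if logit_value >= 0:
        return 1.0 / (1.0 + math.exp(-logit_value))
    z = math.exp(logit_value)
    return z / (1.0 + z)


def logit(p: float) -> float:
    """Map a probability in the open interval (0, 1) to log-odds."""
    return math.log(p) - math.log1p(-p)


print(sigmoid(0.0))         # 0.5
print(logit(sigmoid(2.0)))  # close to 2.0, up to floating-point rounding
```

The two functions are inverses on $(0, 1)$, which is why storing only one parameter loses no information.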

Parameters

probs: Tensor or float = None
Success probability $p \in [0, 1]$. Mutually exclusive with logits.
logits: Tensor or float = None
Log-odds $\ell = \log(p / (1 - p)) \in \mathbb{R}$. Mutually exclusive with probs.
validate_args: bool = None
If True, validate parameter constraints at construction time.

Notes

Probability mass function on $x \in \{0, 1\}$:

$$P(X = x \mid p) = p^x (1 - p)^{1-x}$$

Moments:

$$\mathbb{E}[X] = p, \qquad \mathrm{Var}[X] = p(1 - p), \qquad H[X] = -p\log p - (1-p)\log(1-p)$$

The variance is maximised at $p = 0.5$ (maximum uncertainty) and vanishes at the degenerate endpoints $p \in \{0, 1\}$.
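As a quick check, the moments can be recovered by summing over the two outcomes of the mass function. The sketch below uses plain Python floats rather than lucid tensors, and the helper names `bernoulli_pmf` and `moments` are hypothetical.

```python
import math


def bernoulli_pmf(x: int, p: float) -> float:
    """P(X = x | p) = p^x (1 - p)^(1 - x) for x in {0, 1}."""
    return p**x * (1.0 - p) ** (1 - x)


def moments(p: float) -> tuple[float, float, float]:
    """Mean, variance and entropy (nats) by enumerating both outcomes."""
    probs = [bernoulli_pmf(x, p) for x in (0, 1)]
    mean = sum(x * q for x, q in zip((0, 1), probs))
    var = sum((x - mean) ** 2 * q for x, q in zip((0, 1), probs))
    # Skip zero-probability outcomes, using the convention 0 * log(0) = 0.
    entropy = -sum(q * math.log(q) for q in probs if q > 0.0)
    return mean, var, entropy


mean, var, entropy = moments(0.5)
print(mean, var)  # 0.5 0.25
```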

Relation to other distributions:

  • $\mathrm{Binomial}(n, p)$ is the sum of $n$ independent $\mathrm{Bernoulli}(p)$ draws.
  • $\mathrm{Geometric}(p)$ counts Bernoulli failures before the first success.
  • $\mathrm{Categorical}$ generalises Bernoulli to $K > 2$ categories.

Conjugate prior: lucid.distributions.Beta. Observing $k$ successes in $n$ trials updates $\mathrm{Beta}(\alpha, \beta) \to \mathrm{Beta}(\alpha + k, \beta + n - k)$.
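The conjugate update amounts to counting successes and failures. A minimal sketch, assuming the observations are a plain list of 0/1 values; `beta_posterior` is a hypothetical helper, not a lucid function.

```python
def beta_posterior(
    alpha: float, beta: float, observations: list[int]
) -> tuple[float, float]:
    """Update a Beta(alpha, beta) prior with Bernoulli observations in {0, 1}."""
    k = sum(observations)  # number of successes
    n = len(observations)  # number of trials
    return alpha + k, beta + n - k


# Uniform prior Beta(1, 1), then three successes and one failure.
print(beta_posterior(1.0, 1.0, [1, 1, 0, 1]))  # (4.0, 2.0)
```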

Examples

>>> import lucid
>>> from lucid.distributions import Bernoulli
>>> d = Bernoulli(probs=0.7)
>>> d.mean
Tensor(0.7)
>>> d.sample((4,))
Tensor([...])
>>> d.log_prob(lucid.tensor(1.0))
Tensor(-0.3567)

Methods (7)

dunder

__init__

None
__init__(probs: Tensor | float | None = None, logits: Tensor | float | None = None, validate_args: bool | None = None)

Construct a Bernoulli distribution.

Exactly one of probs or logits must be supplied. The other is derived on demand via the sigmoid / logit transform, so there is no redundant storage.

Parameters

probs: Tensor | float | None = None
Success probability $p \in [0, 1]$. Mutually exclusive with logits.
logits: Tensor | float | None = None
Log-odds $\ell = \log(p / (1-p)) \in \mathbb{R}$. Mutually exclusive with probs.
validate_args: bool | None = None
If True, validate parameter constraints at construction time.

Raises

ValueError
If both or neither of probs / logits are provided.

Examples

>>> from lucid.distributions import Bernoulli
>>> d = Bernoulli(probs=0.7)
>>> d.mean
Tensor(0.7)
>>> d2 = Bernoulli(logits=0.0)  # p = 0.5
>>> d2.probs  # derived lazily
Tensor(0.5)
prop

param

Tensor
param: Tensor

The canonical parameter used at construction time.

Returns the logits tensor when the distribution was constructed with logits, or the probs tensor when constructed with probs. The ExponentialFamily machinery uses this to read the stored parameter without forcing a sigmoid / logit conversion.

Returns

Tensor

Either self.logits or self.probs, depending on which was provided at construction.

Examples

>>> d = Bernoulli(probs=0.3)
>>> d.param  # returns self.probs
Tensor(0.3)
prop

mean

Tensor
mean: Tensor

Expected value of the distribution.

For a Bernoulli random variable $X \in \{0, 1\}$ with success probability $p$:

$$\mathbb{E}[X] = p$$

Returns

Tensor

Success probability $p$, shape batch_shape.

Examples

>>> Bernoulli(probs=0.3).mean
Tensor(0.3)
prop

variance

Tensor
variance: Tensor

Variance of the distribution.

For a Bernoulli random variable $X$ with success probability $p$:

$$\operatorname{Var}[X] = p(1 - p)$$

The variance is maximised at $p = 0.5$ and equals zero at the degenerate extremes $p \in \{0, 1\}$.

Returns

Tensor

Variance $p(1-p)$, shape batch_shape.

Examples

>>> Bernoulli(probs=0.5).variance
Tensor(0.25)
fn

sample

Tensor
sample(sample_shape: tuple[int, ...] = ())

Draw samples from the Bernoulli distribution.

Samples by comparing uniform noise $U \sim \text{Uniform}(0,1)$ against the success probability $p$:

$$X = \mathbf{1}[U < p]$$

The returned tensor has integer-valued entries in $\{0, 1\}$ stored as floating-point and is detached from the autograd graph.
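The thresholding rule above can be sketched with the standard library. This version returns a flat list of floats rather than a lucid tensor, and `sample_bernoulli` is a hypothetical name used only for illustration.

```python
import random


def sample_bernoulli(p: float, n: int, rng: random.Random) -> list[float]:
    """Draw n samples as X = 1[U < p] with U ~ Uniform[0, 1)."""
    return [1.0 if rng.random() < p else 0.0 for _ in range(n)]


rng = random.Random(42)
draws = sample_bernoulli(0.6, 10_000, rng)
print(sum(draws) / len(draws))  # sample mean, expected to be near 0.6
```

Because `random.random()` returns values in $[0, 1)$, the strict comparison gives all zeros for $p = 0$ and all ones for $p = 1$, matching the degenerate cases.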

Parameters

sample_shape: tuple[int, ...] = ()
Leading shape dimensions for the sample batch. The full output shape is sample_shape + batch_shape. Default is ().

Returns

Tensor

Binary samples of shape sample_shape + batch_shape.

Examples

>>> d = Bernoulli(probs=0.6)
>>> x = d.sample((1000,))
>>> x.mean()  # approximately 0.6
fn

log_prob

Tensor
log_prob(value: Tensor)

Log-probability of value under the Bernoulli distribution.

Uses the numerically stable logits form to avoid $\log(0)$:

$$\log p(x \mid \ell) = x \cdot \ell - \log(1 + e^\ell)$$

where $\ell = \log(p / (1-p))$ is the log-odds. This is equivalent to the cross-entropy form $x \log p + (1-x) \log(1-p)$ but avoids numerical issues at the boundaries $p \in \{0, 1\}$.
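To see why the logits form is safer, the sketch below evaluates $x \cdot \ell - \log(1 + e^\ell)$ with a stable softplus and compares it against the cross-entropy form. It works on plain Python floats; `softplus` and `log_prob_logits` are illustrative names, not lucid functions.

```python
import math


def softplus(x: float) -> float:
    """log(1 + e^x), rearranged so the exponential never overflows."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def log_prob_logits(x: float, logit: float) -> float:
    """Bernoulli log-probability of x in {0, 1} given the log-odds."""
    return x * logit - softplus(logit)


logit_07 = math.log(0.7 / 0.3)
print(log_prob_logits(1.0, logit_07), math.log(0.7))  # the two values agree

# At an extreme logit the sigmoid rounds to p = 1.0 in floating point, so
# the cross-entropy form would need log(1 - p) = log(0). The logits form
# stays finite.
print(log_prob_logits(0.0, 800.0))  # -800.0
```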

Parameters

value: Tensor
Observed values. Should be in $\{0, 1\}$.

Returns

Tensor

Element-wise log-probabilities, shape batch_shape.

Examples

>>> import lucid
>>> from lucid.distributions import Bernoulli
>>> d = Bernoulli(probs=0.7)
>>> d.log_prob(lucid.tensor(1.0))  # log(0.7)
Tensor(-0.3567)
fn

entropy

Tensor
entropy()

Shannon entropy of the Bernoulli distribution (in nats).

$$H(X) = -p \log p - (1-p) \log(1-p)$$

Computed in the numerically stable softplus form:

$$H = \log(1 + e^\ell) - p \cdot \ell$$

where $\ell$ is the log-odds. The entropy is maximised at $p = 0.5$ (maximum uncertainty) and is zero at the degenerate cases $p \in \{0, 1\}$.
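The equivalence of the two entropy expressions follows from $\log p = \ell - \log(1 + e^\ell)$ and $\log(1 - p) = -\log(1 + e^\ell)$. It can be checked numerically with a pure-Python sketch; the helper names are assumptions and this is not lucid's code.

```python
import math


def softplus(x: float) -> float:
    """log(1 + e^x) without overflow."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def sigmoid(x: float) -> float:
    """Logistic function, branching on sign to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def entropy_from_logit(logit: float) -> float:
    """H = log(1 + e^l) - p * l, with p = sigmoid(l)."""
    return softplus(logit) - sigmoid(logit) * logit


def entropy_direct(p: float) -> float:
    """H = -p log p - (1 - p) log(1 - p), valid for p strictly in (0, 1)."""
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)


logit_03 = math.log(0.3 / 0.7)
print(entropy_from_logit(logit_03), entropy_direct(0.3))  # the two values agree
print(entropy_from_logit(800.0))  # 0.0, where the direct form would hit log(0)
```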

Returns

Tensor

Entropy in nats, shape batch_shape.

Examples

>>> Bernoulli(probs=0.5).entropy()  # log(2) ≈ 0.693
Tensor(0.6931)