class

Bernoulli

extends ExponentialFamily

Bernoulli(probs: Tensor | float | None = None, logits: Tensor | float | None = None, validate_args: bool | None = None)

Bernoulli distribution over $\{0, 1\}$.

The Bernoulli is the simplest discrete distribution: a single coin flip with success probability $p \in [0, 1]$. It models a binary outcome and is the building block of the Binomial (the sum of $n$ i.i.d. Bernoulli draws), the Categorical (its generalisation to more than two outcomes), and most classification likelihoods in supervised learning.

Specify exactly one of probs or logits. The other is derived lazily via the sigmoid or logit transform, so nothing is stored redundantly and the parameterisation chosen at construction remains the canonical one for autograd.
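The conversion between the two parameterisations can be illustrated with a standalone sketch. It uses only Python's math module rather than lucid, and the helper names sigmoid and logit are illustrative assumptions, not part of the library's API.

```python
import math


def sigmoid(logit_value: float) -> float:
    """Map log-odds to a probability without overflowing exp()."""
    if logit_value >= 0:
        return 1.0 / (1.0 + math.exp(-logit_value))
    z = math.exp(logit_value)
    return z / (1.0 + z)


def logit(p: float) -> float:
    """Map a probability in the open interval (0, 1) to log-odds."""
    return math.log(p) - math.log1p(-p)


p = 0.7
ell = logit(p)          # log(0.7 / 0.3)
p_back = sigmoid(ell)   # recovers p up to rounding error
```

Branching on the sign of the input keeps the argument of exp() non-positive, which is one common way to keep the sigmoid finite for very large log-odds.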

Parameters

probs: Tensor or float = None

Success probability $p \in [0, 1]$. Mutually exclusive with logits.

logits: Tensor or float = None

Log-odds $\ell = \log(p / (1 - p)) \in \mathbb{R}$. Mutually exclusive with probs.

validate_args: bool = None

If True, validate parameter constraints at construction time.

Notes

Probability mass function for $k \in \{0, 1\}$:

$$P(X = k \mid p) = p^k (1 - p)^{1-k}$$

Moments and entropy:

$$\mathbb{E}[X] = p, \qquad \mathrm{Var}[X] = p(1 - p), \qquad H[X] = -p\log p - (1-p)\log(1-p)$$

The variance is maximised at $p = 0.5$ (maximum uncertainty) and vanishes at the degenerate endpoints $p \in \{0, 1\}$.
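The mean and variance formulas follow directly from summing over the two outcomes of the mass function. The sketch below checks this in plain Python; the function names bernoulli_pmf and moments are hypothetical helpers, not lucid functions.

```python
def bernoulli_pmf(k: int, p: float) -> float:
    """P(X = k | p) = p**k * (1 - p)**(1 - k) for k in {0, 1}."""
    if k not in (0, 1):
        raise ValueError("k must be 0 or 1")
    return p**k * (1.0 - p) ** (1 - k)


def moments(p: float) -> tuple[float, float]:
    """Mean and variance obtained by summing over both outcomes."""
    mean = sum(k * bernoulli_pmf(k, p) for k in (0, 1))
    second = sum(k * k * bernoulli_pmf(k, p) for k in (0, 1))
    return mean, second - mean**2


mean, var = moments(0.7)  # p and p * (1 - p)

# Scan a grid of probabilities to locate the largest variance.
grid = [i / 100 for i in range(101)]
p_at_max = max(grid, key=lambda q: moments(q)[1])
```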

Relation to other distributions:

$\mathrm{Binomial}(n, p)$ is the sum of $n$ independent $\mathrm{Bernoulli}(p)$ draws.
$\mathrm{Geometric}(p)$ counts Bernoulli failures before the first success.
$\mathrm{Categorical}$ generalises Bernoulli to $K > 2$ categories.
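The Binomial relation can be verified exactly for small $n$. The following sketch, written with the standard library only and using hypothetical helper names, builds the distribution of a sum of independent Bernoulli variables by repeated convolution and compares it with the closed-form Binomial mass function.

```python
import math


def sum_of_bernoullis_pmf(n: int, p: float) -> list[float]:
    """Exact PMF of the sum of n independent Bernoulli(p) variables."""
    dist = [1.0]  # the empty sum equals 0 with probability 1
    for _ in range(n):
        nxt = [0.0] * (len(dist) + 1)
        for total, prob in enumerate(dist):
            nxt[total] += prob * (1.0 - p)  # next draw is a failure
            nxt[total + 1] += prob * p      # next draw is a success
        dist = nxt
    return dist


def binomial_pmf(n: int, p: float) -> list[float]:
    """Closed-form Binomial(n, p) PMF for comparison."""
    return [math.comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(n + 1)]


n, p = 5, 0.3
convolved = sum_of_bernoullis_pmf(n, p)
closed_form = binomial_pmf(n, p)
max_gap = max(abs(a - b) for a, b in zip(convolved, closed_form))
```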

Conjugate prior: lucid.distributions.Beta. Observing $k$ successes in $n$ trials updates $\mathrm{Beta}(\alpha, \beta)$ to $\mathrm{Beta}(\alpha + k, \beta + n - k)$.
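As a worked illustration of the conjugate update, the sketch below applies the rule to a short list of 0/1 observations. It does not use lucid.distributions.Beta; the function beta_posterior is a hypothetical helper that only tracks the two Beta parameters.

```python
def beta_posterior(alpha: float, beta: float, observations: list[int]) -> tuple[float, float]:
    """Update Beta(alpha, beta) with a list of 0/1 Bernoulli observations."""
    n = len(observations)
    k = sum(observations)             # number of successes
    return alpha + k, beta + (n - k)  # failures increment beta


# Uniform prior Beta(1, 1) and 7 successes in 10 trials.
data = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1]
post_alpha, post_beta = beta_posterior(1.0, 1.0, data)
posterior_mean = post_alpha / (post_alpha + post_beta)  # (1 + 7) / (2 + 10)
```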

Examples

>>> import lucid
>>> from lucid.distributions import Bernoulli
>>> d = Bernoulli(probs=0.7)
>>> d.mean
Tensor(0.7)
>>> d.sample((4,))
Tensor([...])
>>> d.log_prob(lucid.tensor(1.0))
Tensor(-0.3567)

Constructors

dunder

init

→None

__init__(probs: Tensor | float | None = None, logits: Tensor | float | None = None, validate_args: bool | None = None)

Construct a Bernoulli distribution.

Exactly one of probs or logits must be supplied. The other is derived on demand via the sigmoid / logit transform, so there is no redundant storage.

Parameters

probs: Tensor | float | None = None

Success probability $p \in [0, 1]$. Mutually exclusive with logits.

logits: Tensor | float | None = None

Log-odds $\ell = \log(p / (1-p)) \in \mathbb{R}$. Mutually exclusive with probs.

validate_args: bool | None = None

If True, validate parameter constraints at construction time.

Raises

ValueError

If both or neither of probs / logits are provided.

Examples

>>> from lucid.distributions import Bernoulli
>>> d = Bernoulli(probs=0.7)
>>> d.mean
Tensor(0.7)
>>> d2 = Bernoulli(logits=0.0)  # p = 0.5
>>> d2.probs  # derived lazily
Tensor(0.5)
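The exactly-one-of rule described under Raises can be sketched as a small check. The function name resolve_parameterisation and its error message are hypothetical and do not reproduce lucid's actual implementation.

```python
from typing import Optional


def resolve_parameterisation(probs: Optional[float] = None,
                             logits: Optional[float] = None) -> tuple[str, float]:
    """Return which parameter was given, raising if both or neither were."""
    if (probs is None) == (logits is None):
        raise ValueError("Specify exactly one of `probs` or `logits`.")
    if probs is not None:
        return "probs", probs
    return "logits", logits


chosen = resolve_parameterisation(probs=0.7)

# Supplying both parameters is rejected.
try:
    resolve_parameterisation(probs=0.7, logits=0.0)
    rejected_both = False
except ValueError:
    rejected_both = True
```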

Properties

prop

mean

→Tensor

mean: Tensor

Expected value of the distribution.

For a Bernoulli random variable $X \in \{0, 1\}$ with success probability $p$:

$$\mathbb{E}[X] = p$$

Returns

Tensor

Success probability $p$, shape batch_shape.

Examples

>>> Bernoulli(probs=0.3).mean
Tensor(0.3)

prop

param

→Tensor

param: Tensor

The canonical parameter used at construction time.

Returns the logits tensor when the distribution was constructed with logits, or the probs tensor when it was constructed with probs. The ExponentialFamily machinery uses this to read the stored parameter without forcing a conversion.

Returns

Tensor

Either self.logits or self.probs, depending on which was provided at construction.

Examples

>>> d = Bernoulli(probs=0.3)
>>> d.param  # returns self.probs
Tensor(0.3)

prop

variance

→Tensor

variance: Tensor

Variance of the distribution.

For a Bernoulli random variable $X$ with success probability $p$:

$$\operatorname{Var}[X] = p(1 - p)$$

The variance is maximised at $p = 0.5$ and equals zero at the degenerate extremes $p \in \{0, 1\}$.

Returns

Tensor

Variance $p(1-p)$, shape batch_shape.

Examples

>>> Bernoulli(probs=0.5).variance
Tensor(0.25)

Instance methods

entropy

→Tensor

entropy()

Shannon entropy of the Bernoulli distribution (in nats).

$$H(X) = -p \log p - (1-p) \log(1-p)$$

Computed in the numerically stable softplus form:

$$H = \log(1 + e^\ell) - p \cdot \ell$$

where $\ell$ is the log-odds. The entropy is maximised at $p = 0.5$ (maximum uncertainty) and is zero at the degenerate cases $p \in \{0, 1\}$.

Returns

Tensor

Entropy in nats, shape batch_shape.

Examples

>>> Bernoulli(probs=0.5).entropy()  # log(2) ≈ 0.693
Tensor(0.6931)
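The equivalence of the two entropy expressions can be checked numerically. The sketch below is a standalone illustration using Python's math module; the helper names are assumptions and the code is not taken from lucid's source.

```python
import math


def softplus(x: float) -> float:
    """log(1 + exp(x)) without overflow for large |x|."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def sigmoid(x: float) -> float:
    """Logistic function, evaluated without overflowing exp()."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def entropy_from_logit(ell: float) -> float:
    """H = softplus(ell) - p * ell, in nats."""
    return softplus(ell) - sigmoid(ell) * ell


def entropy_from_prob(p: float) -> float:
    """Textbook form; only valid for p strictly inside (0, 1)."""
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)


ell = math.log(0.7 / 0.3)
h_logit = entropy_from_logit(ell)
h_prob = entropy_from_prob(0.7)
h_max = entropy_from_logit(0.0)        # p = 0.5, so H = log(2)
h_extreme = entropy_from_logit(800.0)  # p rounds to 1.0; result stays finite
```

The textbook form would fail with a math domain error at $p = 1$, whereas the softplus form evaluates to a finite value for the saturated logit.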

log_prob

→Tensor

log_prob(value: Tensor)

Log-probability of value under the Bernoulli distribution.

Uses the numerically stable logits form to avoid $\log(0)$:

$$\log p(x \mid \ell) = x \cdot \ell - \log(1 + e^\ell)$$

where $\ell = \log(p / (1-p))$ is the log-odds. This is equivalent to the cross-entropy form $x \log p + (1-x) \log(1-p)$ but avoids numerical issues at the boundaries $p \in \{0, 1\}$.

Parameters

value: Tensor

Observed values. Should be in $\{0, 1\}$.

Returns

Tensor

Element-wise log-probabilities, with the shape of value broadcast against batch_shape.

Examples

>>> import lucid
>>> from lucid.distributions import Bernoulli
>>> d = Bernoulli(probs=0.7)
>>> d.log_prob(lucid.tensor(1.0))  # log(0.7)
Tensor(-0.3567)
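The logits form of the log-probability can be reproduced in a few lines of plain Python. This is a minimal sketch under the formula given above, with hypothetical helper names; it is not lucid's implementation and operates on scalars rather than tensors.

```python
import math


def softplus(x: float) -> float:
    """log(1 + exp(x)) without overflow for large |x|."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def log_prob_from_logit(x: float, ell: float) -> float:
    """log p(x | ell) = x * ell - softplus(ell) for x in {0, 1}."""
    return x * ell - softplus(ell)


ell = math.log(0.7 / 0.3)                      # log-odds for p = 0.7
lp_one = log_prob_from_logit(1.0, ell)         # equals log(0.7)
lp_zero = log_prob_from_logit(0.0, ell)        # equals log(0.3)
lp_saturated = log_prob_from_logit(0.0, 50.0)  # close to -50; no log(0) is evaluated
```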

sample

→Tensor

sample(sample_shape: tuple[int, ...] = ())

Draw samples from the Bernoulli distribution.

Samples by comparing uniform noise $U \sim \text{Uniform}(0,1)$ against the success probability $p$:

$$X = \mathbf{1}[U < p]$$

The returned tensor has integer-valued entries in $\{0, 1\}$ stored as floating-point and is detached from the autograd graph.

Parameters

sample_shape: tuple[int, ...] = ()

Leading shape dimensions for the sample batch. The full output shape is sample_shape + batch_shape. Default is ().

Returns

Tensor

Binary samples of shape sample_shape + batch_shape.

Examples

>>> d = Bernoulli(probs=0.6)
>>> x = d.sample((1000,))
>>> x.mean()  # approximately 0.6
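The thresholding rule can be mimicked with Python's random module. The sketch below is an illustration only: sample_bernoulli is a hypothetical helper that returns a list of floats, and it does not reflect how lucid generates or stores its samples.

```python
import random


def sample_bernoulli(p: float, size: int, rng: random.Random) -> list[float]:
    """Draw `size` samples as X = 1[U < p], with U uniform on [0, 1)."""
    return [1.0 if rng.random() < p else 0.0 for _ in range(size)]


rng = random.Random(0)  # fixed seed for reproducibility
draws = sample_bernoulli(0.6, 10_000, rng)
empirical_mean = sum(draws) / len(draws)  # close to 0.6 for large samples
```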
