class SoftmaxTransform

extends Transform

SoftmaxTransform()

Softmax transform mapping $\mathbb{R}^K \to \Delta^{K-1}$.

Pushes an unconstrained vector $\mathbf{x} \in \mathbb{R}^K$ onto the open $(K-1)$-simplex $\Delta^{K-1}$ via $\mathbf{y} = \mathrm{softmax}(\mathbf{x})$. Operates on the last axis; event_dim = 1.

The transform is over-parameterised: any constant shift along the softmax axis ($\mathbf{x} \to \mathbf{x} + c\mathbf{1}$) yields the same $\mathbf{y}$, so it is not a true bijection. The standard convention used here is to anchor the inverse at $\mathbf{x} = \log \mathbf{y}$ (the log-probabilities), which is one canonical preimage: the one whose log-sum-exp is zero.
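The shift invariance and the canonical inverse can be checked numerically. The following is a minimal sketch in plain Python, not the lucid implementation; the helper names `softmax` and `canonical_inverse` are hypothetical and exist only for illustration.

```python
import math

def softmax(x):
    # Subtract the max for numerical stability; by shift invariance
    # this leaves the result unchanged.
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]

def canonical_inverse(y):
    # Anchor described in the text: x = log y.
    return [math.log(v) for v in y]

x = [0.0, 1.0, 2.0]
y = softmax(x)

# Adding a constant to every logit gives the same point on the simplex.
y_shifted = softmax([v + 5.0 for v in x])

# The canonical inverse does not recover x itself, only a shifted copy of it,
# but pushing it forward again reproduces y.
x_back = canonical_inverse(y)
y_again = softmax(x_back)
offsets = [b - a for a, b in zip(x, x_back)]  # the same constant in every slot
```

Here `offsets` holds one repeated value, which is the constant shift separating the original logits from the anchored preimage.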

Notes

Forward:

$$y_k = \frac{e^{x_k}}{\sum_{j=1}^{K} e^{x_j}}$$

Inverse (canonical anchor):

$$x_k = \log y_k$$

Pseudo-Jacobian used in change-of-variable bookkeeping:

$$\log|\det J| = \sum_{k=1}^{K} \log y_k$$

This convention keeps the simplex-valued pushforward consistent in flow stacks. It is not the log-determinant of the actual $K \times K$ softmax Jacobian, which is singular. For a true bijection between $\mathbb{R}^{K-1}$ and the simplex, use StickBreakingTransform instead.
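To see why a pseudo-Jacobian is needed at all, note that the softmax Jacobian $J_{ij} = y_i(\delta_{ij} - y_j)$ has rows that sum to zero, so its determinant vanishes. The sketch below, in plain Python with hypothetical helper names and no lucid calls, checks this and evaluates the $\sum_k \log y_k$ convention alongside it.

```python
import math

def softmax(x):
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_jacobian(y):
    # J[i][j] = d y_i / d x_j = y_i * (delta_ij - y_j)
    K = len(y)
    return [[y[i] * ((1.0 if i == j else 0.0) - y[j]) for j in range(K)]
            for i in range(K)]

def pseudo_log_abs_det(y):
    # Convention from the Notes: sum over k of log y_k.
    return sum(math.log(v) for v in y)

x = [0.0, 1.0, 2.0]
y = softmax(x)
J = softmax_jacobian(y)

# Each row sums to zero, so J maps the all-ones vector to zero and det J = 0.
row_sums = [sum(row) for row in J]

# The pseudo-Jacobian is finite and strictly negative for any interior point.
pseudo = pseudo_log_abs_det(y)
```

Since every $y_k$ lies strictly between 0 and 1, the pseudo-Jacobian value is always negative.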

Examples

>>> import lucid
>>> from lucid.distributions.transforms import SoftmaxTransform
>>> T = SoftmaxTransform()
>>> T(lucid.tensor([0.0, 1.0, 2.0]))
Tensor([...])

Instance methods

log_abs_det_jacobian(x: Tensor, y: Tensor) → Tensor

Pseudo-Jacobian $\sum_k \log y_k$ for the over-parameterised softmax.

Parameters

x : Tensor

Pre-softmax logits. Used only for shape; the formula collapses through y.

y : Tensor

Post-softmax probabilities in the simplex.

Returns

Tensor

$\sum_k \log y_k$, reduced over the last axis.
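The reduction over the last axis can be illustrated on a small batch. This is a sketch under stated assumptions: nested Python lists stand in for tensors of shape (batch, K), and the function below is a hypothetical stand-in that mirrors the documented signature rather than the library's code.

```python
import math

def log_abs_det_jacobian(x, y):
    # x is consulted only for its shape, as the parameter description states.
    if len(x) != len(y) or any(len(a) != len(b) for a, b in zip(x, y)):
        raise ValueError("x and y must have the same shape")
    # Reduce over the last axis: one scalar per batch row.
    return [sum(math.log(v) for v in row) for row in y]

# Two events with K = 2; each row of y already lies on the simplex.
x_batch = [[0.0, 0.0], [0.0, math.log(3.0)]]
y_batch = [[0.5, 0.5], [0.25, 0.75]]

ladj = log_abs_det_jacobian(x_batch, y_batch)
```

The result has one entry per batch row, so an input of shape (batch, K) reduces to shape (batch,), consistent with event_dim = 1.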
