class

SoftmaxTransform

extends Transform
SoftmaxTransform()

Softmax transform mapping $\mathbb{R}^K \to \Delta^{K-1}$.

Pushes an unconstrained vector $\mathbf{x} \in \mathbb{R}^K$ onto the open simplex $\Delta^{K-1}$ via $\mathbf{y} = \mathrm{softmax}(\mathbf{x})$. Operates on the last axis; event_dim = 1.

The transform is over-parameterised: any constant shift along the softmax axis ($\mathbf{x} \to \mathbf{x} + c\mathbf{1}$) yields the same $\mathbf{y}$, so it is not a true bijection. The convention used here anchors the inverse at $\mathbf{x} = \log \mathbf{y}$ (the log-probabilities, i.e. the preimage with $\sum_k e^{x_k} = 1$), which is one canonical preimage.
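The shift invariance and the canonical inverse can be checked numerically. The following is a minimal sketch in plain Python rather than lucid code; the helper names `softmax` and `canonical_inverse` are hypothetical and only mirror the formulas in this section.

```python
import math


def softmax(x):
    """Softmax over a list of floats, shifted by max(x) for numerical stability."""
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def canonical_inverse(y):
    """Canonical preimage x = log y (hypothetical helper, not the lucid API)."""
    return [math.log(v) for v in y]


x = [0.0, 1.0, 2.0]
y = softmax(x)

# Adding a constant to every component leaves the output unchanged.
y_shifted = softmax([v + 5.0 for v in x])

# The inverse returns log y, which differs from x by a constant shift,
# yet maps back to the same point on the simplex.
x_back = canonical_inverse(y)
y_round_trip = softmax(x_back)
offsets = [b - a for a, b in zip(x, x_back)]
```

Here `y` and `y_shifted` agree to floating-point precision, and every entry of `offsets` is the same constant, so `x_back` is a shifted copy of `x` rather than `x` itself.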

Notes

Forward:

$$y_k = \frac{e^{x_k}}{\sum_{j=1}^{K} e^{x_j}}$$

Inverse (canonical anchor):

$$x_k = \log y_k$$

Pseudo-Jacobian used in change-of-variable bookkeeping:

$$\log|\det J| = \sum_{k=1}^{K} \log y_k$$

This is the convention that keeps the simplex-valued pushforward consistent in flow stacks; for a true bijection between $\mathbb{R}^{K-1}$ and the simplex, use StickBreakingTransform instead.
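The pseudo-Jacobian term can be evaluated without first forming $\mathbf{y}$, by summing the log-softmax of $\mathbf{x}$. The sketch below is plain Python, not the lucid implementation; `log_softmax` and `pseudo_log_abs_det_jacobian` are hypothetical names. Unlike the documented method, the helper takes only `x`. The term is a convention, not a true determinant: the $K \times K$ softmax Jacobian is singular because of the shift invariance.

```python
import math


def log_softmax(x):
    """log y_k = x_k - logsumexp(x), using the max-shift trick for stability."""
    m = max(x)
    lse = m + math.log(sum(math.exp(v - m) for v in x))
    return [v - lse for v in x]


def pseudo_log_abs_det_jacobian(x):
    """Sum over k of log y_k, the pseudo-Jacobian convention stated above."""
    return sum(log_softmax(x))


x = [0.0, 1.0, 2.0]
ladj = pseudo_log_abs_det_jacobian(x)

# For a uniform output y_k = 1/K the term reduces to -K * log(K).
ladj_uniform = pseudo_log_abs_det_jacobian([0.0, 0.0, 0.0])
```

Working in log space avoids taking the logarithm of a softmax output that has underflowed to zero for very negative inputs.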

Examples

>>> import lucid
>>> from lucid.distributions.transforms import SoftmaxTransform
>>> T = SoftmaxTransform()
>>> T(lucid.tensor([0.0, 1.0, 2.0]))
Tensor([...])

Methods (1)

log_abs_det_jacobian

log_abs_det_jacobian(x: Tensor, y: Tensor) -> Tensor

Pseudo-Jacobian $\sum_k \log y_k$ for the over-parameterised softmax.