lucid.linalg.pinv¶
The pinv function computes the Moore-Penrose pseudo-inverse of a matrix. Given any matrix A, it returns the pseudo-inverse A⁺, which generalizes the concept of the inverse to non-square or singular matrices.
Function Signature¶
def pinv(a: Tensor) -> Tensor
Parameters¶
a (Tensor): The input tensor, which can be any two-dimensional tensor (matrix), regardless of shape or rank.
Returns¶
Tensor: The pseudo-inverse of the input matrix a.
Forward Calculation¶
The forward calculation for pinv computes the Moore-Penrose pseudo-inverse using Singular Value Decomposition (SVD):
Compute the thin (reduced) SVD of A:
\[\mathbf{A} = \mathbf{U} \mathbf{\Sigma} \mathbf{V}^\mathrm{T}\]where:
\(\mathbf{U}\) is a matrix whose orthonormal columns are the left singular vectors.
\(\mathbf{\Sigma}\) is a square diagonal matrix containing the singular values (non-negative real numbers, in descending order).
\(\mathbf{V}\) is a matrix whose orthonormal columns are the right singular vectors.
Compute the reciprocal of non-zero singular values:
\[\mathbf{\Sigma}^+ = \operatorname{diag}\left( \frac{1}{\sigma_i} \right)\]for each non-zero singular value σᵢ. Singular values close to zero can be regularized to avoid numerical instability.
Compute the pseudo-inverse:
\[\mathbf{A}^+ = \mathbf{V} \mathbf{\Sigma}^+ \mathbf{U}^\mathrm{T}\]
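A minimal NumPy sketch of these three steps follows; pinv_svd and its tol cutoff are illustrative names, not part of lucid's API:

import numpy as np

def pinv_svd(a: np.ndarray, tol: float = 1e-15) -> np.ndarray:
    u, s, vt = np.linalg.svd(a, full_matrices=False)   # step 1: thin SVD
    cutoff = tol * s.max()                             # step 2: invert only the
    s_inv = np.where(s > cutoff, 1.0 / s, 0.0)         #   singular values above a cutoff
    return vt.T @ np.diag(s_inv) @ u.T                 # step 3: A+ = V S+ U^T

a = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
print(pinv_svd(a))   # agrees with np.linalg.pinv(a)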
Backward Gradient Calculation¶
The gradient of the pseudo-inverse with respect to the input matrix A follows from the differential of the pseudo-inverse and reuses the SVD components computed in the forward pass.
In a practical implementation, this involves the following steps (a NumPy sketch of the complete backward rule appears after the explanation below):
Compute the SVD of A (from the forward pass).
Compute the necessary intermediate matrices:
\(\mathbf{\Sigma}^{-2}\): Diagonal matrix with elements \(1/\sigma_i^2\), which enters through \(\mathbf{A}^+ (\mathbf{A}^+)^\mathrm{T} = \mathbf{V} \mathbf{\Sigma}^{-2} \mathbf{V}^\mathrm{T}\) and \((\mathbf{A}^+)^\mathrm{T} \mathbf{A}^+ = \mathbf{U} \mathbf{\Sigma}^{-2} \mathbf{U}^\mathrm{T}\).
Orthogonal projectors:
\[\mathbf{P}_U = \mathbf{U} \mathbf{U}^\mathrm{T} = \mathbf{A} \mathbf{A}^+, \quad \mathbf{P}_V = \mathbf{V} \mathbf{V}^\mathrm{T} = \mathbf{A}^+ \mathbf{A}\]
Compute the gradient:
\[\frac{\partial L}{\partial \mathbf{A}} = -\left( \mathbf{A}^+ \right)^\mathrm{T} \frac{\partial L}{\partial \mathbf{A}^+} \left( \mathbf{A}^+ \right)^\mathrm{T} + \left( \mathbf{I} - \mathbf{P}_U \right) \left( \frac{\partial L}{\partial \mathbf{A}^+} \right)^\mathrm{T} \mathbf{V} \mathbf{\Sigma}^{-2} \mathbf{V}^\mathrm{T} + \mathbf{U} \mathbf{\Sigma}^{-2} \mathbf{U}^\mathrm{T} \left( \frac{\partial L}{\partial \mathbf{A}^+} \right)^\mathrm{T} \left( \mathbf{I} - \mathbf{P}_V \right)\]where \(\frac{\partial L}{\partial \mathbf{A}^+}\) is the gradient of the loss function with respect to the pseudo-inverse A⁺.
Explanation:
When A is square and invertible, \(\mathbf{P}_U = \mathbf{P}_V = \mathbf{I}\), the two projector terms vanish, and the formula reduces to the familiar gradient of the matrix inverse, \(-\mathbf{A}^{-\mathrm{T}} \frac{\partial L}{\partial \mathbf{A}^{-1}} \mathbf{A}^{-\mathrm{T}}\).
The projector terms account for the non-square and potentially singular nature of A, propagating gradient components that lie outside the column and row spaces of A.
Together, these terms ensure that during backpropagation the gradients are correctly propagated through the pseudo-inverse operation.
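As a concrete reference, here is a NumPy sketch of this backward rule; pinv_backward is an illustrative name, and it maps the upstream gradient grad = ∂L/∂A⁺ to ∂L/∂A:

import numpy as np

def pinv_backward(a: np.ndarray, a_pinv: np.ndarray, grad: np.ndarray) -> np.ndarray:
    # a: (m, n), a_pinv: (n, m), grad = dL/dA+: (n, m); returns dL/dA: (m, n)
    m, n = a.shape
    p_u = a @ a_pinv    # A A+ = U U^T, projector onto the column space of A
    p_v = a_pinv @ a    # A+ A = V V^T, projector onto the row space of A
    term1 = -a_pinv.T @ grad @ a_pinv.T
    term2 = (np.eye(m) - p_u) @ grad.T @ (a_pinv @ a_pinv.T)   # A+ A+^T = V S^-2 V^T
    term3 = (a_pinv.T @ a_pinv) @ grad.T @ (np.eye(n) - p_v)   # A+^T A+ = U S^-2 U^T
    return term1 + term2 + term3

For a square invertible input, p_u and p_v are identity matrices and only term1 survives, recovering the gradient of the ordinary matrix inverse.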
Raises¶
Attention
LinAlgError: If the SVD computation does not converge during the calculation of the pseudo-inverse or its gradient.
ValueError: If the input tensor a is not a two-dimensional tensor.
Example¶
>>> import lucid
>>> a = lucid.Tensor([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
>>> result = lucid.linalg.pinv(a)
>>> print(result)
Tensor([[-1.33333333, -0.33333333, 0.66666667],
[ 1.08333333, 0.33333333, -0.41666667]])
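To sanity-check the result, the four defining Moore-Penrose conditions can be verified numerically; the check below uses NumPy rather than lucid, purely for illustration:

>>> import numpy as np
>>> a = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
>>> p = np.linalg.pinv(a)
>>> np.allclose(a @ p @ a, a)      # A A+ A == A
True
>>> np.allclose(p @ a @ p, p)      # A+ A A+ == A+
True
>>> np.allclose((a @ p).T, a @ p)  # A A+ is symmetric
True
>>> np.allclose((p @ a).T, p @ a)  # A+ A is symmetric
True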
Note
The pseudo-inverse is particularly useful for solving least squares problems and systems of linear equations that do not have a unique solution.
The function supports backpropagation, allowing it to be used in optimization problems involving pseudo-inverses (see the sketch following this note).
Numerical stability is maintained by handling singular values close to zero appropriately during the computation.
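A minimal backpropagation sketch, assuming lucid follows the PyTorch-style requires_grad / backward() / .grad convention (an assumption; consult the Tensor documentation for the exact API):

>>> import lucid
>>> a = lucid.Tensor([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], requires_grad=True)
>>> out = lucid.linalg.pinv(a)
>>> out.sum().backward()  # scalar loss: sum of all entries of A+
>>> print(a.grad)         # dL/dA, same shape as a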
Additional Details¶
Singular Values Near Zero:
Small singular values can cause numerical instability due to division by very small numbers.
In practice, a threshold or regularization parameter may be used to avoid dividing by zero or extremely small values.
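One common alternative to a hard threshold is Tikhonov-style damping: instead of discarding small singular values, each \(1/\sigma_i\) is replaced by \(\sigma_i / (\sigma_i^2 + \lambda)\), which stays bounded as \(\sigma_i \to 0\). A sketch, where pinv_tikhonov and lam are illustrative names rather than lucid API:

import numpy as np

def pinv_tikhonov(a: np.ndarray, lam: float = 1e-10) -> np.ndarray:
    u, s, vt = np.linalg.svd(a, full_matrices=False)
    s_damped = s / (s**2 + lam)     # bounded even as s -> 0
    return vt.T @ np.diag(s_damped) @ u.T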
Applications:
The pseudo-inverse is widely used in machine learning, statistics, and engineering for solving ill-posed problems.
It is essential in computing solutions to linear systems that are underdetermined or overdetermined.
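For instance, for an overdetermined system (more equations than unknowns), x = A⁺b is the least-squares solution, sketched here with NumPy:

import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])   # 3 equations, 2 unknowns
b = np.array([1.0, 2.0, 4.0])

x = np.linalg.pinv(a) @ b                     # minimizes ||a @ x - b||_2
print(x)
print(np.linalg.lstsq(a, b, rcond=None)[0])   # same result from the lstsq solver

For an underdetermined system, A⁺b instead yields the minimum-norm solution among all exact solutions.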
Performance Considerations:
Computing the SVD can be computationally intensive for large matrices.
For performance-critical applications, consider using approximations or specialized algorithms optimized for large-scale computations.
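One such option is a truncated randomized SVD. The sketch below builds a rank-k approximate pseudo-inverse using scikit-learn's randomized_svd; approx_pinv is an illustrative name, and the scikit-learn dependency is an assumption, not a lucid requirement:

import numpy as np
from sklearn.utils.extmath import randomized_svd

def approx_pinv(a: np.ndarray, rank: int) -> np.ndarray:
    # Only the top `rank` singular triplets are computed, which is far
    # cheaper than a full SVD when a is large and approximately low-rank.
    u, s, vt = randomized_svd(a, n_components=rank, random_state=0)
    return vt.T @ np.diag(1.0 / s) @ u.T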