lucid.linalg.matrix_power

lucid.linalg.matrix_power(a: Tensor, /, n: int) -> Tensor

The matrix_power function computes an integer power of a square matrix. Given a square matrix \(\mathbf{A}\) and an integer exponent \(n\), it returns \(\mathbf{A}\) raised to the power \(n\), denoted \(\mathbf{A}^n\).

Function Signature

def matrix_power(a: Tensor, /, n: int) -> Tensor

Parameters

  • a (Tensor): The input tensor, which must be a square matrix (a two-dimensional tensor with equal dimensions).

  • n (int): The exponent to which the matrix is to be raised. Can be any integer (positive, zero, or negative).

Returns

  • Tensor: The resulting tensor after raising the input matrix a to the power n.

Forward Calculation

The forward calculation computes \(\mathbf{A}^n\) according to the sign of the exponent \(n\):

  • For \(n > 0\):

    The function computes the product of \(n\) copies of \(\mathbf{A}\):

    \[\mathbf{A}^n = \underbrace{\mathbf{A} \times \mathbf{A} \times \dots \times \mathbf{A}}_{n \text{ times}}\]
  • For \(n = 0\):

    The function returns the identity matrix of the same dimension as A:

    \[\mathbf{A}^0 = \mathbf{I}\]
  • For \(n < 0\):

    The function inverts \(\mathbf{A}\) and raises the inverse to the power \(|n|\):

    \[\mathbf{A}^n = \left( \mathbf{A}^{-1} \right)^{|n|}\]
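
In code, these three cases amount to repeated multiplication, an identity matrix, and an inverse. The NumPy sketch below mirrors this rule; it is illustrative only, and lucid's actual implementation may differ (for example, by using exponentiation by squaring):

import numpy as np

def matrix_power_naive(a, n):
    # Illustrative reference: computes a**n with |n| - 1 successive multiplications.
    if a.ndim != 2 or a.shape[0] != a.shape[1]:
        raise ValueError("input must be a square matrix")
    if n == 0:
        return np.eye(a.shape[0], dtype=a.dtype)
    base = a if n > 0 else np.linalg.inv(a)  # raises LinAlgError if a is singular
    result = base.copy()
    for _ in range(abs(n) - 1):
        result = result @ base
    return result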

Backward Gradient Calculation

The gradient with respect to the input matrix \(\mathbf{A}\) depends on the sign of \(n\). Let \(L\) denote a scalar loss, and let \(\frac{\partial L}{\partial \mathbf{A}^n}\) be the incoming gradient of \(L\) with respect to the output (often denoted as grad_output in code).

  • For \(n > 0\):

    Applying the product rule to the \(n\)-fold product, each occurrence of \(\mathbf{A}\) contributes one term to the sum:

    \[\frac{\partial L}{\partial \mathbf{A}} = \sum_{k=0}^{n-1} \left( \mathbf{A}^{\,n-1-k} \right)^\top \cdot \frac{\partial L}{\partial \mathbf{A}^n} \cdot \left( \mathbf{A}^{\,k} \right)^\top\]

  • For \(n < 0\):

    Differentiating through the inverse, using \(\mathrm{d}\,\mathbf{A}^{-1} = -\mathbf{A}^{-1} \,\mathrm{d}\mathbf{A}\, \mathbf{A}^{-1}\), introduces an additional negative sign and inverse factors:

    \[\frac{\partial L}{\partial \mathbf{A}} = -\sum_{k=0}^{|n|-1} \left( \mathbf{A}^{-(k+1)} \right)^\top \cdot \frac{\partial L}{\partial \mathbf{A}^n} \cdot \left( \mathbf{A}^{\,n+k} \right)^\top\]

  • For \(n = 0\):

    Since the output is the identity matrix, which does not depend on A, the gradient with respect to A is zero:

    \[\frac{\partial \mathbf{A}^0}{\partial \mathbf{A}} = \mathbf{0}\]

These gradients are essential for backpropagation in optimization algorithms, enabling models that use matrix powers to learn from data.
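
As a sanity check on the \(n > 0\) formula above, the following NumPy sketch (independent of lucid; the helper name grad_matrix_power is illustrative) compares the summation rule against a central finite-difference approximation of \(\frac{\partial L}{\partial \mathbf{A}}\) for the scalar loss \(L = \sum_{ij} G_{ij} (\mathbf{A}^n)_{ij}\):

import numpy as np

def grad_matrix_power(a, n, grad_output):
    # Summation rule for dL/dA, given y = a**n (n > 0) and grad_output = dL/dy.
    grad = np.zeros_like(a)
    for k in range(n):
        left = np.linalg.matrix_power(a, n - 1 - k).T
        right = np.linalg.matrix_power(a, k).T
        grad += left @ grad_output @ right
    return grad

rng = np.random.default_rng(0)
a = rng.standard_normal((3, 3))
g = rng.standard_normal((3, 3))  # stand-in for grad_output
n, eps = 3, 1e-6

# Central finite differences on L = sum(g * a**n), one entry of a at a time.
numeric = np.zeros_like(a)
for i in range(3):
    for j in range(3):
        a_plus, a_minus = a.copy(), a.copy()
        a_plus[i, j] += eps
        a_minus[i, j] -= eps
        numeric[i, j] = (np.sum(g * np.linalg.matrix_power(a_plus, n))
                         - np.sum(g * np.linalg.matrix_power(a_minus, n))) / (2 * eps)

print(np.allclose(grad_matrix_power(a, n, g), numeric, atol=1e-5))  # True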

Raises

  • ValueError: If the input tensor a is not a square matrix (i.e., it is not two-dimensional or its dimensions are not equal).

  • LinAlgError: If the matrix inverse cannot be computed when \(n < 0\) (e.g., if a is singular or not invertible).
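
For instance, a singular matrix causes the negative-exponent path to fail at the inversion step. A minimal guard sketch follows; the exact exception class and its import location are not shown on this page, so a broad except is used here:

import lucid

a = lucid.Tensor([[1.0, 2.0], [2.0, 4.0]])  # singular: second row is twice the first

try:
    lucid.linalg.matrix_power(a, -1)  # requires inverting a singular matrix
except Exception as err:  # expected: LinAlgError (exact import path assumed unknown)
    print(f"matrix_power failed: {err}")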

Example

>>> import lucid
>>> a = lucid.Tensor([[2.0, 0.0], [0.0, 2.0]])
>>> result = lucid.linalg.matrix_power(a, 3)
>>> print(result)
Tensor([[8.0, 0.0],
        [0.0, 8.0]])

>>> result = lucid.linalg.matrix_power(a, -1)
>>> print(result)
Tensor([[0.5, 0.0],
        [0.0, 0.5]])

>>> result = lucid.linalg.matrix_power(a, 0)
>>> print(result)
Tensor([[1.0, 0.0],
        [0.0, 1.0]])

Note

  • The input tensor a must be a square matrix, i.e. of shape \((m, m)\).

  • The exponent n can be any integer, including zero and negative integers.

  • For negative exponents, the function computes the inverse of a before raising it to the power \(|n|\).

  • If a is not invertible (i.e., singular), a LinAlgError will be raised when n is negative.

  • The function does not support non-integer exponents.

  • This function does not support batch processing; each input must be a single square matrix.
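
Since batched inputs are not supported, a batch of matrices can be handled with an explicit Python loop, applying the function to each square matrix individually:

>>> mats = [lucid.Tensor([[2.0, 0.0], [0.0, 2.0]]),
...         lucid.Tensor([[1.0, 1.0], [0.0, 1.0]])]
>>> results = [lucid.linalg.matrix_power(m, 2) for m in mats]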