Anaphory
Anaphory

Reputation: 6400

Simultaneously diagonalize matrices with numpy

I have a m × n × n numpy.ndarray of m simultaneously diagonalizable square matrices and would like to use numpy to obtain their simultaneous eigenvalues.

For example, if I had

from numpy import einsum, diag, array, linalg, random
U = linalg.svd(random.random((3,3)))[2]

M = einsum(
    "ij, ajk, lk",
    U, [diag([2,2,0]), diag([1,-1,1])], U)

the two matrices in M are simultaneously diagonalizable, and I am looking for a way to obtain the array

array([[2.,  1.],
       [2., -1.],
       [0.,  1.]])

(up to permutation of the lines) from M. Is there a built-in or easy way to get this?

Upvotes: 10

Views: 16043

Answers (4)

Arun Chaganty
Arun Chaganty

Reputation: 381

There is a fairly simple and very elegant simultaneous diagonalization algorithm based on Givens rotation that was published by Cardoso and Soulomiac in 1996:

Cardoso, J., & Souloumiac, A. (1996). Jacobi Angles for Simultaneous Diagonalization. SIAM Journal on Matrix Analysis and Applications, 17(1), 161–164. doi:10.1137/S0895479893259546

I've attached a numpy implementation of the algorithm at the end of this response. Caveat: It turns out simultaneous diagonalization is a bit of a tricky numerical problem, with no algorithm (to the best of my knowledge) that guarantees global convergence. However, the cases in which it does not work (see the paper) are degenerate and in practice I have never had the Jacobi angles algorithm fail on me.

#!/usr/bin/env python2.7
# -*- coding: utf-8 -*-
"""
Routines for simultaneous diagonalization
Arun Chaganty <[email protected]>
"""

import numpy as np
from numpy import zeros, eye, diag
from numpy.linalg import norm

def givens_rotate( A, i, j, c, s ):
    """
    Rotate A along axis (i,j) by c and s
    """
    Ai, Aj = A[i,:], A[j,:]
    A[i,:], A[j,:] = c * Ai + s * Aj, c * Aj - s * Ai 

    return A

def givens_double_rotate( A, i, j, c, s ):
    """
    Rotate A along axis (i,j) by c and s
    """
    Ai, Aj = A[i,:], A[j,:]
    A[i,:], A[j,:] = c * Ai + s * Aj, c * Aj - s * Ai 
    A_i, A_j = A[:,i], A[:,j]
    A[:,i], A[:,j] = c * A_i + s * A_j, c * A_j - s * A_i 

    return A

def jacobi_angles( *Ms, **kwargs ):
    r"""
    Simultaneously diagonalize using Jacobi angles
    @article{SC-siam,
       HTML =   "ftp://sig.enst.fr/pub/jfc/Papers/siam_note.ps.gz",
       author = "Jean-Fran\c{c}ois Cardoso and Antoine Souloumiac",
       journal = "{SIAM} J. Mat. Anal. Appl.",
       title = "Jacobi angles for simultaneous diagonalization",
       pages = "161--164",
       volume = "17",
       number = "1",
       month = jan,
       year = {1995}}

    (a) Compute Givens rotations for every pair of indices (i,j) i < j 
            - from eigenvectors of G = gg'; g = A_ij - A_ji, A_ij + A_ji
            - Compute c, s as \sqrt{x+r/2r}, y/\sqrt{2r(x+r)}
    (b) Update matrices by multiplying by the givens rotation R(i,j,c,s)
    (c) Repeat (a) until stopping criterion: sin theta < threshold for all ij pairs
    """

    assert len(Ms) > 0
    m, n = Ms[0].shape
    assert m == n

    sweeps = kwargs.get('sweeps', 500)
    threshold = kwargs.get('eps', 1e-8)
    rank = kwargs.get('rank', m)

    R = eye(m)

    for _ in xrange(sweeps):
        done = True
        for i in xrange(rank):
            for j in xrange(i+1, m):
                G = zeros((2,2))
                for M in Ms:
                    g = np.array([ M[i,i] - M[j,j], M[i,j] + M[j,i] ])
                    G += np.outer(g,g) / len(Ms)
                # Compute the eigenvector directly
                t_on, t_off = G[0,0] - G[1,1], G[0,1] + G[1,0]
                theta = 0.5 * np.arctan2( t_off, t_on + np.sqrt( t_on*t_on + t_off * t_off) )
                c, s = np.cos(theta), np.sin(theta)

                if abs(s) > threshold:
                    done = False
                    # Update the matrices and V
                    for M in Ms:
                        givens_double_rotate(M, i, j, c, s)
                        #assert M[i,i] > M[j, j]
                    R = givens_rotate(R, i, j, c, s)

        if done:
            break
    R = R.T

    L = np.zeros((m, len(Ms)))
    err = 0
    for i, M in enumerate(Ms):
        # The off-diagonal elements of M should be 0
        L[:,i] = diag(M)
        err += norm(M - diag(diag(M)))

    return R, L, err

Upvotes: 8

James
James

Reputation: 11

If you know something about the size of the eigenvalues of the two matrices in advance, you can diagonalize a linear combination of the two matrices, with coefficients chosen to break the degeneracy. For example, if the eigenvalues of both lie between -10 and 10, you could diagonalize 100*M1 + M2. There's a slight loss of precision, but for many purposes it's good enough--and quick and easy!

Upvotes: 1

Anaphory
Anaphory

Reputation: 6400

I am sure there is significant room for improvement in my solution, but I have come up with the following set of three functions doing the calculation for me in a semi-robust way.

def clusters(array,
             orig_indices = None,
             start = 0,
             rtol=numpy.allclose.__defaults__[0],
             atol=numpy.allclose.__defaults__[1]):
    """For an array, return a permutation that sorts the numbers and the sizes of the resulting blocks of identical numbers."""
    array = numpy.asarray(array)
    if not len(array):
        return numpy.array([]),[]
    if orig_indices is None:
        orig_indices = numpy.arange(len(array))
    x = array[0]
    close = abs(array-x) <= (atol + rtol*abs(x))
    first = sum(close)
    r_perm, r_sizes = clusters(
        array[~close],
        orig_indices[~close],
        start+first,
        rtol, atol)
    r_sizes.insert(0, first)
    return numpy.concatenate((orig_indices[close], r_perm)), r_sizes

def permutation_matrix(permutation, dtype=dtype):
    n = len(permutation)
    P = numpy.zeros((n,n), dtype)
    for i,j in enumerate(permutation):
        P[j,i]=1
    return P

def simultaneously_diagonalize(tensor, atol=numpy.allclose.__defaults__[1]):
    tensor = numpy.asarray(tensor)
    old_shape = tensor.shape
    size = old_shape[-1]
    tensor = tensor.reshape((-1, size, size))
    diag_mask = 1-numpy.eye(size)

    eigvalues, diagonalizer = numpy.linalg.eig(tensor[0])
    diagonalization = numpy.dot(
        numpy.dot(
            matrix.linalg.inv(diagonalizer),
            tensor).swapaxes(0,-2),
        diagonalizer)
    if numpy.allclose(diag_mask*diagonalization, 0):
        return diagonalization.diagonal(axis1=-2, axis2=-1).reshape(old_shape[:-1])
    else:
        perm, cluster_sizes = clusters(diagonalization[0].diagonal())
        perm_matrix = permutation_matrix(perm)
        diagonalization = numpy.dot(
            numpy.dot(
                perm_matrix.T,
                diagonalization).swapaxes(0,-2),
            perm_matrix)
        mask = 1-scipy.linalg.block_diag(
            *list(
                numpy.ones((blocksize, blocksize))
                for blocksize in cluster_sizes))
        print(diagonalization)
        assert(numpy.allclose(
                diagonalization*mask,
                0)) # Assert that the matrices are co-diagonalizable
        blocks = numpy.cumsum(cluster_sizes)
        start = 0
        other_part = []
        for block in blocks:
            other_part.append(
                simultaneously_diagonalize(
                    diagonalization[1:, start:block, start:block]))
            start = block
        return numpy.vstack(
            (diagonalization[0].diagonal(axis1=-2, axis2=-1),
             numpy.hstack(other_part)))

Upvotes: 1

B&#225;lint Aradi
B&#225;lint Aradi

Reputation: 3812

I am not aware of any direct solution. But why not just getting the eigenvalues and the eigenvectors of the first matrix, and using the eigenvectors to transform all other matrices to the diagonal form? Something like:

eigvals, eigvecs = np.linalg.eig(matrix1)
eigvals2 = np.diagonal(np.dot(np.dot(transpose(eigvecs), matrix2), eigvecs))

You can the add the columns to an array via hstack if you like.

UPDATE: As pointed out below, this is only valid if no degenerate eigenvalues occur. Otherwise one would have to check first for the degenerate eigenvalues, then transform the 2nd matrix to a blockdiagonal form, and diagonalize eventual blocks bigger than 1x1 separately.

Upvotes: 3

Related Questions