Kerrick Staley

Reputation: 1933

Fast way to convert upper triangular matrix into symmetric matrix

I have an upper-triangular matrix of np.float64 values, like this:

array([[ 1.,  2.,  3.,  4.],
       [ 0.,  5.,  6.,  7.],
       [ 0.,  0.,  8.,  9.],
       [ 0.,  0.,  0., 10.]])

I would like to convert this into the corresponding symmetric matrix, like this:

array([[ 1.,  2.,  3.,  4.],
       [ 2.,  5.,  6.,  7.],
       [ 3.,  6.,  8.,  9.],
       [ 4.,  7.,  9., 10.]])

The conversion can be done in place, or as a new matrix. I would like it to be as fast as possible. How can I do this quickly?

Upvotes: 9

Views: 4037

Answers (4)

Soudipta Dutta

Reputation: 2132

import numpy as np

matrix = np.array([[1.,  2.,  3.,  4.],
                   [0.,  5.,  6.,  7.],
                   [0.,  0.,  8.,  9.],
                   [0.,  0.,  0., 10.]])
print(matrix)
'''
[[ 1.  2.  3.  4.]
 [ 0.  5.  6.  7.]
 [ 0.  0.  8.  9.]
 [ 0.  0.  0. 10.]]
'''
# The sum below mirrors the upper triangle into the lower triangle.
# The result is already symmetric, but every diagonal element is doubled
# because it is added to itself.
symmetric_matrix = matrix  + matrix.T 
print(symmetric_matrix)

'''
[[ 2.  2.  3.  4.]
 [ 2. 10.  6.  7.]
 [ 3.  6. 16.  9.]
 [ 4.  7.  9. 20.]]
'''
# Restore the diagonal from the original matrix.
np.fill_diagonal(symmetric_matrix,np.diag(matrix))
print(symmetric_matrix)
'''
[[ 1.  2.  3.  4.]
 [ 2.  5.  6.  7.]
 [ 3.  6.  8.  9.]
 [ 4.  7.  9. 10.]]
'''
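
A variant of the same idea, sketched here as an alternative and not taken from the code above, subtracts one copy of the diagonal instead of overwriting it afterwards. The function name `symmetrize_by_sum` is made up for illustration, and the sketch assumes the lower triangle of the input is exactly zero.

```python
import numpy as np

def symmetrize_by_sum(ut):
    """Return a new symmetric matrix built from an upper-triangular one."""
    # ut + ut.T mirrors the upper triangle but doubles the diagonal,
    # so one copy of the diagonal is subtracted to undo the doubling.
    return ut + ut.T - np.diag(np.diag(ut))

matrix = np.array([[1.,  2.,  3.,  4.],
                   [0.,  5.,  6.,  7.],
                   [0.,  0.,  8.,  9.],
                   [0.,  0.,  0., 10.]])

symmetric_matrix = symmetrize_by_sum(matrix)
print(symmetric_matrix)
```

The input is left untouched, so this is an out-of-place conversion. No timing is claimed for it.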

Upvotes: 0

Kerrick Staley

Reputation: 1933

This is the fastest routine I've found so far that doesn't use Cython or a JIT like Numba. It takes about 1.6 μs on my machine to process a 4x4 array (average time over a list of 100K 4x4 arrays):

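import numpy as np

# Boolean masks of the strict lower triangle, keyed by matrix size.
inds_cache = {}

def upper_triangular_to_symmetric(ut):
    """Mirror the upper triangle into the lower triangle, in place."""
    n = ut.shape[0]
    try:
        inds = inds_cache[n]
    except KeyError:
        # Builtin bool works on every NumPy release; the np.bool alias
        # is missing from NumPy 1.24 through 1.26.
        inds = np.tri(n, k=-1, dtype=bool)
        inds_cache[n] = inds
    ut[inds] = ut.T[inds]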
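
The same caching idea can also be sketched with `functools.lru_cache` and integer indices from `np.tril_indices` instead of a hand-written dict and a boolean mask. This is an illustrative variant, not something I have timed; the names `lower_indices` and `upper_triangular_to_symmetric_cached` are made up for the example.

```python
from functools import lru_cache

import numpy as np

@lru_cache(maxsize=None)
def lower_indices(n):
    # Row and column indices of the strict lower triangle of an n x n matrix.
    return np.tril_indices(n, k=-1)

def upper_triangular_to_symmetric_cached(ut):
    """Mirror the upper triangle into the lower triangle, in place."""
    rows, cols = lower_indices(ut.shape[0])
    # The right-hand side is gathered into a temporary array first,
    # so reading and writing the same matrix is safe here.
    ut[rows, cols] = ut[cols, rows]

ut = np.array([[1.,  2.,  3.,  4.],
               [0.,  5.,  6.,  7.],
               [0.,  0.,  8.,  9.],
               [0.,  0.,  0., 10.]])
upper_triangular_to_symmetric_cached(ut)
print(ut)
```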

Here are some other things I've tried that are not as fast:

The above code, but without the cache. Takes about 8.3 μs per 4x4 array:

def upper_triangular_to_symmetric(ut):
    n = ut.shape[0]
    inds = np.tri(n, k=-1, dtype=bool)  # rebuilt on every call
    ut[inds] = ut.T[inds]

A plain Python nested loop. Takes about 2.5 μs per 4x4 array:

def upper_triangular_to_symmetric(ut):
    n = ut.shape[0]
    for r in range(1, n):
        for c in range(r):
            ut[r, c] = ut[c, r]

Floating point addition using np.triu. Takes about 11.9 μs per 4x4 array:

def upper_triangular_to_symmetric(ut):
    ut += np.triu(ut, k=1).T

Numba version of Python nested loop. This was the fastest thing I found (about 0.4 μs per 4x4 array), and was what I ended up using in production, at least until I started running into issues with Numba and had to revert back to a pure Python version:

import numba

@numba.njit()
def upper_triangular_to_symmetric(ut):
    n = ut.shape[0]
    for r in range(1, n):
        for c in range(r):
            ut[r, c] = ut[c, r]

Cython version of Python nested loop. I'm new to Cython so this may not be fully optimized. Since Cython adds operational overhead, I'm interested in hearing both Cython and pure-Numpy answers. Takes about 0.6 μs per 4x4 array:

cimport numpy as np
cimport cython

@cython.boundscheck(False)
@cython.wraparound(False)
def upper_triangular_to_symmetric(np.ndarray[np.float64_t, ndim=2] ut):
    cdef int n, r, c
    n = ut.shape[0]
    for r in range(1, n):
        for c in range(r):
            ut[r, c] = ut[c, r]

Upvotes: 4

max9111

Reputation: 6482

You are mainly measuring function call overhead on such tiny problems.

Another way to do that would be to use Numba. Let's start with an implementation for only one (4x4) array.

Only one 4x4 array

import numpy as np
import numba as nb

@nb.njit()
def sym(A):
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            A[j,i]=A[i,j]
    return A


A=np.array([[ 1.,  2.,  3.,  4.],
       [ 0.,  5.,  6.,  7.],
       [ 0.,  0.,  8.,  9.],
       [ 0.,  0.,  0., 10.]])

%timeit sym(A)
#277 ns ± 5.21 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

Larger example

@nb.njit(parallel=False)
def sym_3d(A):
    for i in nb.prange(A.shape[0]):
        for j in range(A.shape[1]):
            for k in range(A.shape[2]):
                A[i,k,j]=A[i,j,k]
    return A

A=np.random.rand(1_000_000,4,4)

%timeit sym_3d(A)
#13.8 ms ± 49.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
#13.8 ns per 4x4 submatrix
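
For comparison, the batched case can also be sketched in plain NumPy, since `np.triu` operates on the last two axes of a stacked array. This is not benchmarked here and is only meant to show that one vectorized call over the whole stack avoids the per-matrix call overhead; `sym_batch` is a name made up for the example, and each submatrix is assumed to be upper triangular with an exactly zero lower triangle.

```python
import numpy as np

def sym_batch(A):
    """Symmetrize a stack of upper-triangular matrices, shape (N, n, n)."""
    # Strict upper triangle of every submatrix (diagonal excluded).
    upper = np.triu(A, k=1)
    # Swapping the last two axes transposes each submatrix, so the
    # strict upper triangle lands in the lower triangle. Returns a new array.
    return A + upper.swapaxes(-1, -2)

rng = np.random.default_rng(0)
A = np.triu(rng.random((1000, 4, 4)))  # stack of upper-triangular inputs
S = sym_batch(A)
print(S[0])
```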

Upvotes: 2

Paul Panzer

Reputation: 53029

np.where seems quite fast in the out-of-place, no-cache scenario:

np.where(ut,ut,ut.T)

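On my laptop (assuming timeit here is timeit.timeit with its default of 1,000,000 calls, the totals below are in seconds and also read as μs per call):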

timeit(lambda:np.where(ut,ut,ut.T))
# 1.909718865994364
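
Note that `np.where(ut, ut, ut.T)` keeps an entry of `ut` wherever it is nonzero and takes the transposed entry otherwise, so it relies on the lower triangle being exactly zero. A small check with made-up values (zeros inside the upper triangle are harmless, a stray nonzero below the diagonal is not):

```python
import numpy as np

# Upper-triangular input that also has zeros in its upper triangle.
ut = np.array([[1., 2., 0.],
               [0., 0., 6.],
               [0., 0., 8.]])
sym = np.where(ut, ut, ut.T)
print(sym)

# Same matrix with a stray nonzero below the diagonal.
dirty = ut.copy()
dirty[1, 0] = -1.0
not_sym = np.where(dirty, dirty, dirty.T)
print(np.array_equal(not_sym, not_sym.T))
```

In the second case both `dirty[0, 1]` and `dirty[1, 0]` are nonzero, so each is kept as-is and the result is no longer symmetric.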

If you have pythran installed, you can speed this up about 3x with near zero effort. But note that, as far as I know, pythran (currently) only understands contiguous arrays.

File upp2sym.py, compile with pythran -O3 upp2sym.py:

import numpy as np

#pythran export upp2sym(float[:,:])

def upp2sym(a):
    return np.where(a,a,a.T)

Timing:

from upp2sym import *

timeit(lambda:upp2sym(ut))
# 0.5760842661838979

This is almost as fast as looping:

#pythran export upp2sym_loop(float[:,:])

def upp2sym_loop(a):
    out = np.empty_like(a)
    for i in range(len(a)):
        out[i,i] = a[i,i]
        for j in range(i):
            out[i,j] = out[j,i] = a[j,i]
    return out

Timing:

timeit(lambda:upp2sym_loop(ut))
# 0.4794591029640287

We can also do it in place:

#pythran export upp2sym_inplace(float[:,:])

def upp2sym_inplace(a):
    for i in range(len(a)):
        for j in range(i):
            a[i,j] = a[j,i]

Timing:

timeit(lambda:upp2sym_inplace(ut))
# 0.28711927914991975

Upvotes: 6
