Reputation: 1933
I have an upper-triangular matrix of np.float64 values, like this:
array([[ 1.,  2.,  3.,  4.],
       [ 0.,  5.,  6.,  7.],
       [ 0.,  0.,  8.,  9.],
       [ 0.,  0.,  0., 10.]])
I would like to convert this into the corresponding symmetric matrix, like this:
array([[ 1.,  2.,  3.,  4.],
       [ 2.,  5.,  6.,  7.],
       [ 3.,  6.,  8.,  9.],
       [ 4.,  7.,  9., 10.]])
The conversion can be done in place, or as a new matrix. I would like it to be as fast as possible. How can I do this quickly?
Upvotes: 9
Views: 4037
Reputation: 2132
import numpy as np
matrix = upper_triangular_matrix = np.array([[1., 2., 3., 4.],
                                             [0., 5., 6., 7.],
                                             [0., 0., 8., 9.],
                                             [0., 0., 0., 10.]])
print(matrix)
'''
[[ 1. 2. 3. 4.]
[ 0. 5. 6. 7.]
[ 0. 0. 8. 9.]
[ 0. 0. 0. 10.]]
'''
'''
Adding the transpose copies the upper triangle into the lower triangle.
The result is already symmetric, but every diagonal element is doubled.
'''
symmetric_matrix = matrix + matrix.T
print(symmetric_matrix)
'''
[[ 2. 2. 3. 4.]
[ 2. 10. 6. 7.]
[ 3. 6. 16. 9.]
[ 4. 7. 9. 20.]]
'''
# Restore the diagonal from the original matrix
np.fill_diagonal(symmetric_matrix, np.diag(matrix))
print(symmetric_matrix)
'''
[[ 1. 2. 3. 4.]
[ 2. 5. 6. 7.]
[ 3. 6. 8. 9.]
[ 4. 7. 9. 10.]]
'''
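The two steps above (add the transpose, then repair the diagonal) can also be folded into one expression. This is only a sketch of the same idea; the function name `symmetrize_out_of_place` is made up for illustration, and it allocates temporaries, so it is unlikely to be the fastest option.

```python
import numpy as np

def symmetrize_out_of_place(m):
    """Return a new symmetric matrix built from an upper-triangular one."""
    # m + m.T doubles the diagonal, so subtract one copy of the diagonal.
    return m + m.T - np.diag(np.diag(m))

matrix = np.array([[1., 2., 3., 4.],
                   [0., 5., 6., 7.],
                   [0., 0., 8., 9.],
                   [0., 0., 0., 10.]])
symmetric_matrix = symmetrize_out_of_place(matrix)
```

The input is left unchanged, which matches the "as a new matrix" option in the question.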
Upvotes: 0
Reputation: 1933
This is the fastest routine I've found so far that doesn't use Cython or a JIT like Numba. It takes about 1.6 μs on my machine to process a 4x4 array (average time over a list of 100K 4x4 arrays):
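import numpy as np
inds_cache = {}  # maps matrix size n -> boolean mask of the strictly lower triangle
def upper_triangular_to_symmetric(ut):
    n = ut.shape[0]
    try:
        inds = inds_cache[n]
    except KeyError:
        # builtin bool works across NumPy versions (np.bool was removed in NumPy 1.24)
        inds = np.tri(n, k=-1, dtype=bool)
        inds_cache[n] = inds
    ut[inds] = ut.T[inds]  # in place: copy the upper triangle into the lower one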
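For comparison, here is a sketch of the same cached-mask idea with `functools.lru_cache` doing the bookkeeping instead of a hand-written dict. The helper name `lower_mask` is hypothetical, and this variant has not been timed against the routine above, so treat it as a readability alternative rather than a speed claim.

```python
from functools import lru_cache
import numpy as np

@lru_cache(maxsize=None)
def lower_mask(n):
    """Boolean mask selecting the strictly lower triangle of an n x n matrix."""
    return np.tri(n, k=-1, dtype=bool)

def upper_triangular_to_symmetric(ut):
    """Mirror the upper triangle into the lower triangle, in place."""
    mask = lower_mask(ut.shape[0])
    # ut.T[mask] is a copy (boolean indexing), so the assignment is safe.
    ut[mask] = ut.T[mask]

ut = np.array([[1., 2., 3., 4.],
               [0., 5., 6., 7.],
               [0., 0., 8., 9.],
               [0., 0., 0., 10.]])
upper_triangular_to_symmetric(ut)
```

After the call, `ut` equals its own transpose and the mask for size 4 is reused on every later call.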
Here are some other things I've tried that are not as fast:
The above code, but without the cache. Takes about 8.3 μs per 4x4 array:
def upper_triangular_to_symmetric(ut):
    n = ut.shape[0]
    inds = np.tri(n, k=-1, dtype=bool)  # mask is rebuilt on every call
    ut[inds] = ut.T[inds]
A plain Python nested loop. Takes about 2.5 μs per 4x4 array:
def upper_triangular_to_symmetric(ut):
    n = ut.shape[0]
    for r in range(1, n):
        for c in range(r):
            ut[r, c] = ut[c, r]
Floating point addition using np.triu. Takes about 11.9 μs per 4x4 array:
def upper_triangular_to_symmetric(ut):
    ut += np.triu(ut, k=1).T
Numba version of Python nested loop. This was the fastest thing I found (about 0.4 μs per 4x4 array), and was what I ended up using in production, at least until I started running into issues with Numba and had to revert back to a pure Python version:
import numba
@numba.njit()
def upper_triangular_to_symmetric(ut):
    n = ut.shape[0]
    for r in range(1, n):
        for c in range(r):
            ut[r, c] = ut[c, r]
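Since the Numba and pure-Python versions share the same loop body, one way to keep both available is to compile the loop only when Numba can be imported. This is a sketch under that assumption; the helper name `_mirror_upper` is made up, and it only covers the case where Numba is missing, not the unspecified issues mentioned above.

```python
import numpy as np

def _mirror_upper(ut):
    """Copy the upper triangle into the lower triangle, in place."""
    n = ut.shape[0]
    for r in range(1, n):
        for c in range(r):
            ut[r, c] = ut[c, r]

try:
    import numba
    # Compile the loop if Numba is available...
    upper_triangular_to_symmetric = numba.njit(_mirror_upper)
except ImportError:
    # ...otherwise fall back to the plain Python loop.
    upper_triangular_to_symmetric = _mirror_upper

ut = np.array([[1., 2., 3.],
               [0., 4., 5.],
               [0., 0., 6.]])
upper_triangular_to_symmetric(ut)
```

Callers use the same name either way, so switching between the two versions does not touch the calling code.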
Cython version of Python nested loop. I'm new to Cython so this may not be fully optimized. Since Cython adds operational overhead, I'm interested in hearing both Cython and pure-Numpy answers. Takes about 0.6 μs per 4x4 array:
cimport numpy as np
cimport cython
@cython.boundscheck(False)
@cython.wraparound(False)
def upper_triangular_to_symmetric(np.ndarray[np.float64_t, ndim=2] ut):
    cdef int n, r, c
    n = ut.shape[0]
    for r in range(1, n):
        for c in range(r):
            ut[r, c] = ut[c, r]
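The per-array timings quoted in this answer are averages over a list of many 4x4 arrays. A minimal harness along those lines might look like the sketch below; the names `microseconds_per_array` and `mirror_upper` are hypothetical, the array count is reduced to keep it quick, and the numbers it produces will differ from machine to machine.

```python
import timeit
import numpy as np

def mirror_upper(ut):
    """Uncached mask version, used here only as something to time."""
    mask = np.tri(ut.shape[0], k=-1, dtype=bool)
    ut[mask] = ut.T[mask]

def microseconds_per_array(func, n_arrays=10_000, size=4, repeats=3):
    """Average time per array, in microseconds, taking the best of several passes."""
    rng = np.random.default_rng(0)
    arrays = [np.triu(rng.random((size, size))) for _ in range(n_arrays)]

    def run():
        for a in arrays:
            func(a)

    # Later passes see already-symmetric arrays, but the work done is the same.
    best = min(timeit.repeat(run, number=1, repeat=repeats))
    return best / n_arrays * 1e6

elapsed = microseconds_per_array(mirror_upper, n_arrays=1_000)
```

Passing each candidate function through the same harness keeps the comparison consistent.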
Upvotes: 4
Reputation: 6482
Another way to do that would be to use Numba. Let's start with an implementation for only one (4x4) array.
Only one 4x4 array
import numpy as np
import numba as nb
@nb.njit()
def sym(A):
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            A[j,i]=A[i,j]
    return A
A=np.array([[ 1., 2., 3., 4.],
            [ 0., 5., 6., 7.],
            [ 0., 0., 8., 9.],
            [ 0., 0., 0., 10.]])
%timeit sym(A)
#277 ns ± 5.21 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
Larger example
@nb.njit(parallel=False)
def sym_3d(A):
    # with parallel=False, nb.prange behaves like range
    for i in nb.prange(A.shape[0]):
        for j in range(A.shape[1]):
            for k in range(A.shape[2]):
                A[i,k,j]=A[i,j,k]
    return A
A=np.random.rand(1_000_000,4,4)
%timeit sym_3d(A)
#13.8 ms ± 49.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
#13.8 ns per 4x4 submatrix
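If Numba is not an option, the same batch can be processed with plain NumPy by applying one boolean mask to the last two axes of the whole stack. This is a sketch only: the function name `symmetrize_stack` is made up, it assumes an array of shape (N, n, n), and it has not been timed against `sym_3d`.

```python
import numpy as np

def symmetrize_stack(stack):
    """Mirror the upper triangle of each matrix in an (N, n, n) stack, in place."""
    n = stack.shape[-1]
    mask = np.tri(n, k=-1, dtype=bool)  # strictly lower triangle
    # The right-hand side is a copy (boolean indexing), so reads and
    # writes into the same array do not interfere.
    stack[:, mask] = stack.transpose(0, 2, 1)[:, mask]
    return stack

rng = np.random.default_rng(0)
stack = np.triu(rng.random((5, 4, 4)))  # five upper-triangular 4x4 matrices
upper_before = stack.copy()
symmetrize_stack(stack)
```

The upper triangles are left as they were; only the strictly lower entries are overwritten.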
Upvotes: 2
Reputation: 53029
np.where seems quite fast in the out-of-place, no-cache scenario:
np.where(ut,ut,ut.T)
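To see why this works, note that `np.where` takes the element from `ut` wherever it is nonzero and from `ut.T` everywhere else. A small check, using a made-up 3x3 input that also has zeros inside its upper triangle, might look like this:

```python
import numpy as np

# Upper-triangular input with some zeros above the diagonal as well.
ut = np.array([[1., 0., 3.],
               [0., 5., 0.],
               [0., 0., 8.]])

# Nonzero entries come from ut; zero entries are filled from the transpose.
sym = np.where(ut, ut, ut.T)

expected = np.array([[1., 0., 3.],
                     [0., 5., 0.],
                     [3., 0., 8.]])
```

A zero in the upper triangle is replaced by its mirror entry from the lower triangle, which is also zero, so the result is still correct, and `ut` itself is not modified.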
On my laptop:
from timeit import timeit
timeit(lambda:np.where(ut,ut,ut.T))
# 1.909718865994364 (seconds for timeit's default 1,000,000 calls, so about 1.9 μs per call)
If you have pythran installed you can speed this up about 3 times with near zero effort. But note that as far as I know pythran (currently) only understands contiguous arrays.
file <upp2sym.py>, compile with pythran -O3 upp2sym.py
import numpy as np
#pythran export upp2sym(float[:,:])
def upp2sym(a):
    return np.where(a,a,a.T)
Timing:
from upp2sym import *
timeit(lambda:upp2sym(ut))
# 0.5760842661838979
This is almost as fast as looping:
#pythran export upp2sym_loop(float[:,:])
def upp2sym_loop(a):
    out = np.empty_like(a)
    for i in range(len(a)):
        out[i,i] = a[i,i]
        for j in range(i):
            out[i,j] = out[j,i] = a[j,i]
    return out
Timing:
timeit(lambda:upp2sym_loop(ut))
# 0.4794591029640287
We can also do it inplace:
#pythran export upp2sym_inplace(float[:,:])
def upp2sym_inplace(a):
    for i in range(len(a)):
        for j in range(i):
            a[i,j] = a[j,i]
Timing:
timeit(lambda:upp2sym_inplace(ut))
# 0.28711927914991975
Upvotes: 6