Reputation: 1467
I have two sparse matrices that I want to compare element-wise:
from scipy import sparse as sp
t1 = sp.random(10, 10, 0.5)
t2 = sp.random(10, 10, 0.5)
In particular I would like to make a scatterplot for those elements present (i.e. non-zero) in both matrices, but so far the only way I could think of is to convert them to the dense format:
import matplotlib.pyplot as plt
plt.plot(t1.todense().flatten(),
         t2.todense().flatten(),
         'ko',
         alpha=0.1)
This works terribly when the matrices are very large. Is there a more efficient way to do this?
Upvotes: 0
Views: 48
Reputation: 231625
In [256]: t1
Out[256]:
<10x10 sparse matrix of type '<class 'numpy.float64'>'
with 50 stored elements in COOrdinate format>
In [257]: t2
Out[257]:
<10x10 sparse matrix of type '<class 'numpy.float64'>'
with 50 stored elements in COOrdinate format>
When plotting t1.todense().flatten() you plot data points for all elements of t1, whether zero or not. In this case that is 100 points.
One way to 'weed' out the zero elements is:
In [258]: t3 = t1.multiply(t2)
In [259]: t3
Out[259]:
<10x10 sparse matrix of type '<class 'numpy.float64'>'
with 28 stored elements in Compressed Sparse Row format>
In [260]: t11 = t3.astype(bool).multiply(t1)
In [261]: t21 = t3.astype(bool).multiply(t2)
In [262]: t11
Out[262]:
<10x10 sparse matrix of type '<class 'numpy.float64'>'
with 28 stored elements in Compressed Sparse Row format>
t3 has nonzero values where both t1 and t2 are nonzero. t11 has the corresponding elements of t1, and t21 those of t2. (The t3 floats become boolean True, which is implicitly treated as 1 in the multiply.) Sparse multiply is relatively efficient, though maybe not as efficient as the corresponding dense multiply or even the sparse matrix multiply.
We could plot t11.todense().ravel() etc. That would give the same plot, except for a concentration of values at (0.0, 0.0). But the data attribute holds the nonzero values, and the sparsity pattern of t11 and t21 is the same, so we can just plot those: only 28 values in this case.
plt.plot(t11.data,
         t21.data,
         'ko',
         alpha=0.1);
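The steps above can be collected into one helper. This is only a sketch: the function name common_nonzero_values is made up for illustration, and it builds the mask from the boolean versions of both inputs rather than from the float product t3. It also sorts the indices explicitly so that the two data arrays are sure to line up pairwise.

```python
import numpy as np
from scipy import sparse as sp


def common_nonzero_values(a, b):
    """Return paired values of a and b at positions nonzero in both.

    Hypothetical helper, not part of scipy.
    """
    a = sp.csr_matrix(a)
    b = sp.csr_matrix(b)
    # Boolean mask: True only where both matrices store a nonzero value.
    mask = a.astype(bool).multiply(b.astype(bool))
    # Masking each input yields two matrices with the same sparsity pattern.
    a_common = mask.multiply(a).tocsr()
    b_common = mask.multiply(b).tocsr()
    # Sort column indices within each row so both .data arrays
    # follow the same (row-major) order.
    a_common.sort_indices()
    b_common.sort_indices()
    return a_common.data, b_common.data
```

With the matrices from the question this would be used as x, y = common_nonzero_values(t1, t2), followed by plt.plot(x, y, 'ko', alpha=0.1).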
There may be other ways of getting the t11 and t21 matrices, but the basic idea still applies - get two matrices with the same sparsity, and plot just their data values.
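One such alternative, as a rough sketch rather than a recommendation: work on the COO coordinates directly, turn each (row, col) pair into a single linear index, and intersect the two index sets with np.intersect1d. The function name common_values_by_index is hypothetical, and I have not timed this against the multiply approach.

```python
import numpy as np
from scipy import sparse as sp


def common_values_by_index(a, b):
    """Return paired values of a and b at positions nonzero in both.

    Hypothetical helper, not part of scipy.
    """
    # Work on copies so the caller's matrices are not modified in place.
    a = sp.coo_matrix(a, copy=True)
    b = sp.coo_matrix(b, copy=True)
    for m in (a, b):
        m.sum_duplicates()   # at most one entry per (row, col)
        m.eliminate_zeros()  # drop explicitly stored zeros
    ncols = a.shape[1]
    # Linear (row-major) index for every stored element.
    ka = a.row.astype(np.int64) * ncols + a.col
    kb = b.row.astype(np.int64) * ncols + b.col
    # ia and ib index the matching elements in each matrix.
    _, ia, ib = np.intersect1d(ka, kb, assume_unique=True,
                               return_indices=True)
    return a.data[ia], b.data[ib]
```

The two returned arrays have equal length and can be passed to plt.plot in the same way as t11.data and t21.data.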
Upvotes: 1