Reputation: 3
I have a scipy sparse matrix in one variable, which I copy to another new variable. If I now change the diagonal of the sparse matrix in the new variable, the sparse matrix in the original variable updates as well. The same happens if I change the data attribute. I don't understand why this is. Is there a purpose for that behavior that I don't see, or am I doing something in an unintended way?
What I would like is to start off with a sparse matrix, make several copies of it, and modify the diagonal or all entries of each copy as needed. But my modifications are affecting all copies.
I figured out that the problem does not occur if I use methods that create a copy of the sparse matrix, like power() or matrix multiplication with @.
One modification I would like to have is a copy of the sparse matrix where I take the absolute value of all entries. If I use abs() directly on the sparse matrix, it creates a copy as desired, and everything is fine. But if I write the absolute values of all entries into the data attribute of the sparse matrix, it affects all other copies of the sparse matrix as well. I found the latter method to be considerably faster, which is why I would prefer to use it.
The problem is independent of the sparse matrix format (apart from the data attribute for the lil or dok format).
I tried it with Python 3.5.2 and Python 3.7.3 (two different computers) in Spyder 3.3.1, and I am using scipy version 1.3.0.
Say I have a sparse matrix
from numpy import array as ar
from scipy.sparse import csc_matrix as spmat
Msp = spmat(ar([[0.,-3.],[2.,-4.]]))
and I make some copies (I could also always copy Msp, doesn't make a difference)
M1 = Msp
M2 = M1
If I now do
M2.data = abs(M2.data)
or
M1.setdiag([1,1])
it also changes all other copies, e.g. after applying both operations above:
Msp.toarray()
array([[1., 3.],
[2., 1.]])
and the same for M1 and M2.
I would have expected
M2.toarray()
array([[ 0., 3.],
[ 2., 4.]])
and
M1.toarray()
array([[1., -3.],
[2., 1.]])
and
Msp.toarray()
array([[ 0., -3.],
[ 2., -4.]])
On the other hand, if I do something of the following type
M2 = abs(M2)
M2 = M2.power(2)
M2 = M2@M2
it does only affect M2 and leaves M1 and Msp untouched as I would expect.
Upvotes: 0
Views: 473
Reputation: 1216
By changing the following lines:
M1 = Msp
M2 = M1
to:
M1 = Msp.copy()
M2 = M1.copy()
your given example will work as intended.
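Scipy sparse matrices, like NumPy arrays, are mutable objects, and a plain assignment does not copy them. Changes made through one variable are therefore visible through every variable that refers to the same object.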
In other words:
After M2 = M1, M2 is just another name for the object that M1 refers to, so any in-place change made through one name shows up through the other. M2 = M1.copy(), on the other hand, creates a new, independent sparse matrix, which is thereafter unaffected by changes to M1.
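The difference between a second name and a real copy can be checked with the `is` operator. The following is a minimal sketch using the matrix from the question; the names `alias` and `clone` are made up for illustration.

```python
import numpy as np
from scipy.sparse import csc_matrix

Msp = csc_matrix(np.array([[0., -3.], [2., -4.]]))

alias = Msp         # plain assignment: a second name for the same object
clone = Msp.copy()  # copy(): a new matrix with its own data arrays

print(alias is Msp)  # True
print(clone is Msp)  # False

# Writing into the copy's data attribute leaves the original alone.
clone.data = abs(clone.data)

print(Msp.toarray())    # still contains -3 and -4
print(clone.toarray())  # contains 3 and 4
```

Anything done through `alias` would have changed `Msp` too, because both names point at one object.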
The examples given at the end only affect M2 because abs(), power() and @ each return a new sparse matrix instead of modifying their operand in place. The assignment M2 = ... then rebinds M2 to that new object, while M1 and Msp still refer to the original one.
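As a sketch of that distinction, the snippet below first rebinds M2 via abs() and then takes the absolute value in place on an explicit copy, which is one way to keep writing to the data attribute without touching the original. The name `M3` and the use of `np.abs(..., out=...)` are choices made for this illustration, not something from the question. Whether it is actually faster than abs() on your matrices is worth measuring.

```python
import numpy as np
from scipy.sparse import csc_matrix

Msp = csc_matrix(np.array([[0., -3.], [2., -4.]]))

M2 = Msp          # M2 and Msp name the same object
M2 = abs(M2)      # abs() returns a new matrix; M2 now names that one
print(M2 is Msp)  # False

# In-place variant: copy once, then overwrite the stored values of the copy.
M3 = Msp.copy()
np.abs(M3.data, out=M3.data)

print(np.shares_memory(M3.data, Msp.data))  # False
print(Msp.toarray())                        # original is unchanged
```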
Upvotes: 1