Reputation: 46530
I am slowly trying to understand the difference between views and copies in numpy, as well as mutable vs. immutable types.
If I access part of an array with 'advanced indexing' it is supposed to return a copy. This seems to be true:
In [1]: import numpy as np
In [2]: a = np.zeros((3,3))
In [3]: b = np.array(np.identity(3), dtype=bool)
In [4]: c = a[b]
In [5]: c[:] = 9
In [6]: a
Out[6]:
array([[ 0., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.]])
Since c is just a copy, it does not share data with a, and changing it does not mutate a. However, this is what confuses me:
In [7]: a[b] = 1
In [8]: a
Out[8]:
array([[ 1., 0., 0.],
[ 0., 1., 0.],
[ 0., 0., 1.]])
So it seems that, even if I use advanced indexing, assignment still treats the thing on the left as a view. Clearly the a in line 2 is the same object/data as the a in line 6, since mutating c has no effect on it.
So my question: is the a in line 8 the same object/data as before (not counting the diagonal, of course), or is it a copy? In other words, was a's data copied to a new a, or was its data mutated in place?
For example, is it like:
x = [1,2,3]
x += [4]
or like:
y = (1,2,3)
y += (4,)
I don't know how to check for this because in either case a.flags.owndata is True. Please feel free to elaborate or answer a different question if I'm thinking about this in a confusing way.
Upvotes: 15
Views: 1843
Reputation: 1833
This seems to be a common misunderstanding. Quoting from the SciPy Cookbook (https://scipy-cookbook.readthedocs.io/items/ViewsVsCopies.html):
The rule of thumb here can be: in the context of lvalue indexing (i.e. the indices are placed in the left hand side value of an assignment), no view or copy of the array is created (because there is no need to). However, with regular values, the above rules for creating views does apply.
In other words, the notion of view or copy only applies when retrieving values from a numpy object.
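To make the distinction concrete, here is a minimal sketch that reuses a and b from the question. It uses np.shares_memory to test whether a retrieved array shares data with a; the other variable names are only illustrative.

```python
import numpy as np

a = np.zeros((3, 3))
b = np.identity(3, dtype=bool)

# Retrieval with a boolean mask (advanced indexing) builds a new array.
c = a[b]
mask_result_shares_memory = np.shares_memory(a, c)

# Retrieval with a basic slice returns a view onto a's data.
v = a[0:2]
slice_result_shares_memory = np.shares_memory(a, v)

# Assignment through the same mask writes directly into a;
# no intermediate view or copy is handed back to the caller.
a[b] = 1
diagonal_after_assignment = a.diagonal().tolist()

print(mask_result_shares_memory)   # False
print(slice_result_shares_memory)  # True
print(diagonal_after_assignment)   # [1.0, 1.0, 1.0]
print(c.tolist())                  # [0.0, 0.0, 0.0], the earlier copy is untouched
```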
Upvotes: 3
Reputation: 67427
When you do c = a[b], a.__getitem__ is called with b as its only argument, and whatever gets returned is assigned to c.
When you do a[b] = c, a.__setitem__ is called with b and c as arguments, and whatever gets returned is silently discarded.
So despite sharing the same a[b] syntax, the two expressions do different things. You could subclass ndarray, override these two methods, and have them behave differently. By default in numpy, the former returns a copy (if b is an array) but the latter modifies a in place.
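As a rough illustration of that dispatch, the sketch below defines a hypothetical ndarray subclass, LoggingArray (not part of numpy), that records which special method each use of the a[b] syntax triggers before delegating to the default behaviour.

```python
import numpy as np

class LoggingArray(np.ndarray):
    """Hypothetical subclass that records which special method is invoked."""
    calls = []

    def __getitem__(self, index):
        LoggingArray.calls.append("__getitem__")
        return super().__getitem__(index)

    def __setitem__(self, index, value):
        LoggingArray.calls.append("__setitem__")
        super().__setitem__(index, value)

a = np.zeros((3, 3)).view(LoggingArray)
b = np.identity(3, dtype=bool)

LoggingArray.calls.clear()
c = a[b]     # a[b] on the right-hand side: a read
a[b] = 1     # a[b] on the left-hand side: a write
recorded = list(LoggingArray.calls)

# Convert to a plain ndarray before inspecting, so the check itself is not logged.
diagonal = np.asarray(a).diagonal().tolist()
print(recorded)
print(diagonal)
```

The read goes through __getitem__ and the write through __setitem__, and the diagonal of a ends up set to 1 even though c was produced as a separate array.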
Upvotes: 12
Reputation: 251398
Yes, it is the same object. Here's how you check:
>>> a
array([[ 0., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.]])
>>> a2 = a
>>> a[b] = 1
>>> a2 is a
True
>>> a2
array([[ 1., 0., 0.],
[ 0., 1., 0.],
[ 0., 0., 1.]])
Assigning to some expression in Python is not the same as just reading the value of that expression. When you do c = a[b], with a[b] on the right of the equals sign, it returns a new object. When you do a[b] = 1, with a[b] on the left of the equals sign, it modifies the original object.
In fact, an expression like a[b] = 1 cannot change what the name a is bound to. The code that handles obj[index] = value only gets to know the object obj, not what name was used to refer to that object, so it can't change what that name refers to.
Upvotes: 4