Reputation: 159
Here is my first code snippet. When run, it doesn't throw an assertion error.
import numpy as np
this_arr = np.ones(10)
next_arr = this_arr
next_arr *= 2
assert np.array_equal(this_arr, next_arr)
Here is my second code snippet. When run, it does throw an assertion error.
import numpy as np
this_arr = np.ones(10)
next_arr = this_arr
next_arr = next_arr * 2
assert np.array_equal(this_arr, next_arr)
This behavior is confusing to me.
My understanding of the first code snippet is that I initialize the name this_arr to point to the value at some memory location. Then I initialize the name next_arr to point to the same value at the same memory location. Therefore, when I change the value pointed to by next_arr, the value pointed to by this_arr should also change. This behavior is the "Mutable-Presto-Chango," a term coined by Ned Batchelder here.
However, the second code snippet does not behave this way. At first, I thought that maybe the *= operator somehow doesn't change the value's location in memory while the * operator does. But then I went back through the first snippet and found that the memory locations of this_arr and next_arr are different here too! Given that, how does the program "know" to change the values of this_arr to match those of the changed next_arr? Also, why doesn't the program "know" to change the values in the second code snippet?
Edit: As a follow-up question: even though next_arr and this_arr have different memory locations, is there some underlying connection between the two that Python has set up?
Thanks!
Upvotes: 1
Views: 3953
Reputation: 231540
I prefer to talk in terms of objects and references, rather than values. So I would describe your first code as:
This creates an ndarray object and assigns it (or a reference to it) to this_arr:
this_arr = np.ones(10)
This assigns the same reference to next_arr:
next_arr = this_arr
So next_arr and this_arr reference the same object.
The next line makes an 'in-place' change to the array object. It doesn't matter which name is used.
next_arr *= 2
The two names still reference the same array object. (Under the covers, *= does some buffering, but the array object and data buffer location remain the same.) Another in-place change would be next_arr[1] = 10 (this would be true for list objects as well).
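As a rough illustration of the point above (my own sketch, not part of the question's code; the array length of 4 and the list values are arbitrary), both kinds of in-place change show up through either name, and element assignment on a list behaves the same way:

```python
import numpy as np

this_arr = np.ones(4)
next_arr = this_arr            # a second name for the same ndarray object

next_arr[1] = 10               # in-place element assignment
next_arr *= 2                  # in-place multiplication

# Both names still refer to one object, so this_arr sees every change.
same_object = next_arr is this_arr
print(same_object)             # True
print(this_arr)

# Element assignment on a plain list is also an in-place change.
this_list = [1, 1, 1]
next_list = this_list
next_list[1] = 10
print(this_list)               # [1, 10, 1]
```

The `is` operator checks object identity directly, which is usually more convenient than comparing `id()` values by hand.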
With
next_arr = next_arr * 2
the multiplication makes a new array object. That new object is assigned to next_arr, breaking any link with the previously referenced object (which this_arr still references).
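A minimal sketch of that rebinding, assuming the same setup as the question but with a shorter array (the names before and after are just labels I picked for the two identity checks):

```python
import numpy as np

this_arr = np.ones(4)
next_arr = this_arr
before = next_arr is this_arr   # True: one object, two names

# The right-hand side builds a brand-new array; the assignment then
# rebinds the name next_arr to it. this_arr is not touched.
next_arr = next_arr * 2
after = next_arr is this_arr    # False: the names now refer to different objects

print(before, after)
print(this_arr)                 # still all ones
print(next_arr)                 # all twos
```

This is why the assertion in the second snippet fails: the original array was never modified, only the name next_arr was pointed somewhere else.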
If id(this_arr) and id(next_arr) are the same, then they reference the same object. Roughly, the id is a location, but it is not the same as a pointer in C. Be wary about comparing ids over time, though; they may be reused.
arr.__array_interface__ is another handy tool. It has a data key that tells us where the underlying data buffer of an array is located. But to understand that you need to know something about how arrays are stored, and the distinction between view and copy.
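To make the view/copy distinction a bit more concrete, here is a small sketch. The helper data_address is a hypothetical convenience function of mine, not a NumPy API; it just reads the buffer address out of __array_interface__. np.shares_memory is a real NumPy function that answers the same question more directly.

```python
import numpy as np

def data_address(a):
    # Hypothetical helper: first element of the 'data' entry is the
    # address of the array's underlying buffer.
    return a.__array_interface__['data'][0]

arr = np.arange(6)
view = arr[:]        # basic slice: a new array object over the same buffer
copy = arr.copy()    # a new array object with its own buffer

view_is_new_object = view is not arr
view_same_buffer = data_address(view) == data_address(arr)
copy_same_buffer = data_address(copy) == data_address(arr)

view[0] = 99         # writes through to arr, but not to copy

print(view_is_new_object, view_same_buffer, copy_same_buffer)
print(np.shares_memory(arr, view), np.shares_memory(arr, copy))
print(arr[0], copy[0])
```

So two distinct array objects (different ids) can still be connected through a shared data buffer, which is a separate situation from two names referencing one object.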
Upvotes: 2
Reputation: 674
When you initialize next_arr = this_arr, what it actually does is copy the values from this_arr's location to a new location for next_arr. That is my understanding of this code; otherwise this behaviour wouldn't be possible.
Upvotes: -1