Reputation: 459
I'm not very familiar with Python. I have been reading the book 'Python for Data Analysis' recently, and I'm a bit confused about NumPy boolean indexing and setting. The book says:
Selecting data from an array by boolean indexing always creates a copy of the data, even if the returned array is unchanged.
Setting values with boolean arrays works in a common-sense way.
I tried it with the following code:
First:
import numpy as np
data = np.random.randn(7, 4)
data[data < 0] = 0 # this does change `data`
Second:
data = np.random.randn(7, 4)
copied = data[data < 0]
copied[1] = 1 # this does not change `data`
I do not quite understand this. Can anyone explain it? In my understanding, copied should be a pointer to the data[data < 0] slice.
Upvotes: 0
Views: 1857
Reputation: 53089
As a rule of thumb numpy creates a view where possible and a copy where necessary.
When is a view possible? When the data can be addressed using strides, i.e. for example for a 2d array A each A[i, j] sits in memory at address base + i*stride[0] + j*stride[1]. If you create a subarray using just slices, this will always be the case, which is why you will get a view.
For logical and advanced indexing it will typically not be possible to find a base and strides which happen to address the right elements. Therefore these operations return a new array with data copied.
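A minimal sketch of the difference, assuming a small int64 array chosen purely for illustration. It uses np.shares_memory to check whether two arrays overlap in memory, and the strides attribute to show that a slice can still be described by a base address plus strides:

```python
import numpy as np

# 3x4 array of 8-byte integers (illustrative data, not from the question)
a = np.arange(12, dtype=np.int64).reshape(3, 4)
print(a.strides)        # (32, 8): 32 bytes to the next row, 8 to the next column

# Basic slicing: every other row, columns 1 onward.
# This is still "base + i*stride[0] + j*stride[1]", so NumPy returns a view.
sliced = a[::2, 1:]
print(sliced.strides)                 # (64, 8)
print(np.shares_memory(a, sliced))    # True

# Writing through the view changes the original array.
sliced[0, 0] = 100
print(a[0, 1])                        # 100

# Boolean indexing: the selected elements are scattered irregularly,
# so no single base/strides pair fits and NumPy copies the data.
masked = a[a > 5]
print(np.shares_memory(a, masked))    # False

# Writing to the copy leaves the original untouched.
masked[0] = -1
print(a[0, 1])                        # still 100
```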
Upvotes: 3
Reputation: 14399
While data[data < 0] = 0 sorta looks like a view being set to 0, that's not what's actually happening. In reality, an indexed ndarray followed by = calls __setitem__, which handles the piecewise assignment and writes directly into the original array.
When the indexed ndarray is on the other side of the =, __setitem__ isn't called and you assign a copy (as boolean indexing always does), which is independent of the original array.
Essentially:
foo[foo != bar] = bar # calls __setitem__
foo[:2] = bar # calls __setitem__
bar = foo[foo != bar] # makes a copy
bar = foo[:2] # makes a view
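A runnable sketch of the four cases above. The array values and the names picked and head are made up for illustration:

```python
import numpy as np

foo = np.array([-1.0, 2.0, -3.0, 4.0])

# Indexing on the left of `=` dispatches to __setitem__,
# which writes into foo's own buffer.
foo[foo < 0] = 0.0               # foo is now [0, 2, 0, 4]

# The same kind of assignment spelled as an explicit method call.
foo.__setitem__(foo > 3, 9.0)    # foo is now [0, 2, 0, 9]

# Boolean indexing on the right-hand side returns a copy.
picked = foo[foo > 1]            # [2, 9]
picked[0] = -5.0                 # changes picked only
print(np.shares_memory(foo, picked))   # False

# Basic slicing on the right-hand side returns a view.
head = foo[:2]
head[0] = 7.0                    # also changes foo[0]
print(np.shares_memory(foo, head))     # True

print(foo)                       # [7. 2. 0. 9.]
```

So whether the original array changes depends on two things: whether the indexing expression is the target of an assignment, and, when it is not, whether the index is a plain slice or a mask.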
Upvotes: 5
Reputation: 11
Based on the sequence of the code:
data = np.random.randn(7, 4): This step creates an array of size 7 by 4.
data[data < 0] = 0: This step sets all the elements in data which are < 0 to 0.
copied = data[data < 0]: This step generates an empty array, as there is no element in data which is < 0 after the assignment above.
copied[1] = 1: This step raises an error, as copied is an empty array and thus index 1 does not exist.
Upvotes: 1