Reputation: 21533
import numpy as np

arr = np.arange(0, 11)
slice_of_arr = arr[0:6]
slice_of_arr[:] = 99
# slice_of_arr returns
# array([99, 99, 99, 99, 99, 99])
# arr returns
# array([99, 99, 99, 99, 99, 99, 6, 7, 8, 9, 10])
As the example above shows, you cannot change the values of slice_of_arr without also changing arr, because it's a view of arr, not a new array.
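My questions are:
1. Why is NumPy designed like this? Wouldn't it be tedious every time you need to .copy and then assign value?
2. Is there anything I can do to get rid of the .copy? How can I change this default behavior of NumPy?
Upvotes: 8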
Views: 932
Reputation: 3550
I think you have the answers in the other comments, but more specifically:
1.a. Why is NumPy designed like this?
Because creating a view is much faster (constant time) than copying the whole array (linear time in the number of elements).
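To make the view/copy distinction concrete, here is a small sketch using the array from the question. The variable names `view` and `independent` are only illustrative.

```python
import numpy as np

arr = np.arange(0, 11)

view = arr[0:6]                # basic slicing: a view, no data is copied
independent = arr[0:6].copy()  # explicit copy: new memory is allocated

print(view.base is arr)                    # True: the view refers back to arr
print(np.shares_memory(arr, view))         # True
print(np.shares_memory(arr, independent))  # False

independent[:] = 99  # only the copy changes
print(arr)           # [ 0  1  2  3  4  5  6  7  8  9 10]
```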
1.b. Wouldn't it be tedious every time you need to .copy and then assign value?
Actually, it's not that common to need a copy, so no, it's not tedious. Even if it can be surprising at first, this design is very good.
2.a. Is there anything I can do, to get rid of the .copy?
I can't really tell without seeing real code. In the toy example you give, you can't avoid creating a copy, but in real code you usually apply functions to the data, which return another array, so a copy isn't needed.
Can you give an example of real code where you need to call .copy repeatedly?
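As a rough illustration of that point (the operations below are arbitrary examples, not taken from the question), arithmetic and most NumPy functions allocate a new result, so the slice is only read and the original stays intact without any .copy:

```python
import numpy as np

arr = np.arange(0, 11)

# Both operations read from the view and write into a freshly allocated array.
doubled = arr[0:6] * 2
clipped = np.clip(arr[0:6], 2, 4)

print(doubled)  # [ 0  2  4  6  8 10]
print(clipped)  # [2 2 2 3 4 4]
print(arr)      # [ 0  1  2  3  4  5  6  7  8  9 10]
```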
2.b. How can I change this default behavior of NumPy?
You can't. Try to get used to it, you'll see how powerful it is.
Upvotes: 3
Reputation: 231385
The question "What does (numpy) __array_wrap__ do?" talks about ndarray subclasses and hooks like __array_wrap__. np.array takes a copy parameter, forcing the result to be a copy, even if it isn't required by other considerations. ravel returns a view where possible, flatten always returns a copy. So it is probably possible, and maybe not too difficult, to construct an ndarray subclass that forces a copy. It may involve modifying a hook like __array_wrap__.
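A quick way to check which of these calls copy is np.shares_memory. This is just a sketch on a small example array:

```python
import numpy as np

arr = np.arange(6).reshape(2, 3)

forced = np.array(arr, copy=True)  # copy=True always allocates a new array
raveled = arr.ravel()              # a view here, because arr is contiguous
flat = arr.flatten()               # always a copy

print(np.shares_memory(arr, forced))   # False
print(np.shares_memory(arr, raveled))  # True
print(np.shares_memory(arr, flat))     # False

# ravel falls back to a copy when a flat view is not possible,
# e.g. for the transposed (non C-contiguous) array.
print(np.shares_memory(arr, arr.T.ravel()))  # False
```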
Or maybe modifying the __getitem__ method. Indexing as in slice_of_arr = arr[0:6] involves a call to __getitem__. For ndarray this is compiled, but for a masked array, it is Python code that you could use as an example:
/usr/lib/python3/dist-packages/numpy/ma/core.py
It may be something as simple as
def __getitem__(self, indx):
    """x.__getitem__(y) <==> x[y]
    """
    # _data = ndarray.view(self, ndarray)  # change to:
    # (ndarray.copy takes an order argument, not a type, so copy the view instead)
    _data = ndarray.view(self, ndarray).copy()
    dout = ndarray.__getitem__(_data, indx)
    return dout
But I suspect that by the time you develop and fully test such a subclass, you might fall in love with the default no-copy approach. While this view-v-copy business bites many newcomers (especially if coming from MATLAB), I haven't seen complaints from experienced users. Look at other numpy SO questions; you won't see a lot of copy() calls.
Even regular Python users are used to asking themselves whether a reference or slice is a copy or not, and whether something is mutable or not.
For example, with lists:
In [754]: ll=[1,2,[3,4,5],6]
In [755]: llslice=ll[1:-1]
In [756]: llslice[1][1:2]=[10,11,12]
In [757]: ll
Out[757]: [1, 2, [3, 10, 11, 12, 5], 6]
Modifying a mutable item inside a slice modifies that same item in the original list. In contrast to numpy, a list slice is a copy. But it's a shallow copy. You have to take extra effort to make a deep copy (import copy).
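A short sketch of the shallow versus deep distinction, using the same list as above (the names shallow and deep are illustrative):

```python
import copy

ll = [1, 2, [3, 4, 5], 6]

shallow = ll[1:-1]              # new outer list, but the same inner list object
deep = copy.deepcopy(ll[1:-1])  # the inner list is duplicated as well

print(shallow[1] is ll[2])  # True
print(deep[1] is ll[2])     # False

deep[1][1:2] = [10, 11, 12]  # changes only the deep copy
print(ll)                    # [1, 2, [3, 4, 5], 6]
```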
/usr/lib/python3/dist-packages/numpy/lib/index_tricks.py contains some indexing functions aimed at making certain indexing operations more convenient. Several are actually classes, or class instances, with custom __getitem__ methods. They may also serve as models of how to customize your slicing and indexing.
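Two of those helpers, np.s_ and np.r_, are instances whose __getitem__ turns the bracket syntax into a slice object or an index array. A small sketch of how they behave:

```python
import numpy as np

arr = np.arange(0, 11)

first_six = np.s_[0:6]  # builds a reusable slice object
picked = np.r_[0:3, 7]  # concatenates the range 0..2 and the value 7

print(first_six)       # slice(0, 6, None)
print(arr[first_six])  # [0 1 2 3 4 5]
print(picked)          # [0 1 2 7]

# Indexing with an integer array ("fancy" indexing) returns a copy, not a view.
print(np.shares_memory(arr, arr[picked]))  # False
```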
Upvotes: 1