mon

Reputation: 22254

Which NumPy indexing returns a copy and which returns a view?

Question

Regarding NumPy indexing, which can return either a copy or a view, please confirm whether my understanding is correct. If it is not, please explain and point me to the relevant specifications.

Q: Basic slicing

The NumPy Indexing documentation has a Basic Slicing and Indexing section. I believe this section covers only basic slicing, which Indexing lists as one of the three kinds of indexing.

There are three kinds of indexing available: field access, basic slicing, advanced indexing. Which one occurs depends on obj.

Basic slicing uses only Python slice objects and always returns a view. Even when a slice is inside a tuple, which may fall under the statement below, the result is still a view.

If one supplies to the index a tuple, the tuple will be interpreted as a list of indices.

This code will return a view.

>>> indices = (1,1,1,slice(0,2)) # same as [1,1,1,0:2]
>>> z[indices]
array([39, 40])
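
One way to test this claim is sketched below, assuming the same z as in the Background section. np.shares_memory reports whether two arrays overlap in memory, and writing through the result shows whether the original changes. The names sub, shares and changed are illustrative only.

```python
import numpy as np

z = np.arange(81).reshape(3, 3, 3, 3)
indices = (1, 1, 1, slice(0, 2))   # same as z[1, 1, 1, 0:2]
sub = z[indices]

# A view shares its buffer with the array it was taken from.
shares = np.shares_memory(z, sub)

# Writing through a view changes the original array.
sub[0] = -1
changed = bool(z[1, 1, 1, 0] == -1)

print(shares, changed)
```

If both values print as True, the tuple containing a slice produced a view.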

Is this understanding correct?

Q: Combining basic slicing and advanced indexing

There is a section in the document:

When there is at least one slice (:), ellipsis (...) or newaxis in the index (or the array has more dimensions than there are advanced indexes), then the behavior can be more complicated.

I believe basic indexing in the section title means basic slicing. There is no basic indexing distinct from basic slicing, because Indexing says there are only three kinds: field access, basic slicing, and advanced indexing.

When advanced indexing and basic indexing are combined, the result is a copy.
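
A small check of this, as a sketch on a hypothetical 3x4 array (the names a and mixed are not from the documentation): a list index combined with a slice should give a result that does not share memory with the source.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# A list (advanced index) on axis 0 combined with a slice (basic index) on axis 1.
mixed = a[[0, 2], 1:3]

# A copy has its own buffer, so no memory is shared with the source.
is_copy = not np.shares_memory(a, mixed)

print(mixed.shape, is_copy)
```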

Is this understanding correct?


Background

NumPy indexing has many options; some return a view and others return a copy. It is hard to form a clear mental classification of which operations return a reference/view and which return a copy.

There are three kinds of indexing available: field access, basic slicing, advanced indexing. Which one occurs depends on obj.

Basic slicing occurs when obj is a slice object (constructed by start:stop:step notation inside of brackets), an integer, or a tuple of slice objects and integers. Ellipsis and newaxis objects can be interspersed with these as well.

NumPy slicing creates a view instead of a copy as in the case of builtin Python sequences such as string, tuple and list.
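
The contrast with builtin sequences can be seen in a minimal sketch like the following, where the same slice is taken from a Python list and from an array and then modified. All variable names are illustrative.

```python
import numpy as np

# Slicing a Python list builds a new list.
py_list = [0, 1, 2, 3]
py_slice = py_list[1:3]
py_slice[0] = 99            # py_list is not affected

# Slicing a NumPy array builds a view onto the same data.
arr = np.array([0, 1, 2, 3])
arr_slice = arr[1:3]
arr_slice[0] = 99           # arr[1] is changed as well

print(py_list)
print(arr.tolist())
```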

Advanced indexing is triggered when the selection object, obj, is a non-tuple sequence object, an ndarray (of data type integer or bool), or a tuple with at least one sequence object or ndarray (of data type integer or bool). There are two types of advanced indexing: integer and Boolean.

Advanced indexing always returns a copy of the data (contrast with basic slicing that returns a view).
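
Both types of advanced indexing can be checked in the same way. The following is only a sketch with made-up names (a, by_int, by_bool):

```python
import numpy as np

a = np.arange(6)

by_int = a[[0, 2, 4]]       # integer advanced indexing
by_bool = a[a % 2 == 0]     # Boolean advanced indexing

# Neither result overlaps with the source array in memory.
int_shares = np.shares_memory(a, by_int)
bool_shares = np.shares_memory(a, by_bool)

# Writing to a copy leaves the source untouched.
by_int[0] = 100

print(int_shares, bool_shares, a[0])
```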

If one supplies to the index a tuple, the tuple will be interpreted as a list of indices.

>>> import numpy as np
>>> z = np.arange(81).reshape(3,3,3,3)
>>> indices = (1,1,1,1)
>>> z[indices]
40

If the ndarray object is a structured array the fields of the array can be accessed by indexing the array with strings, dictionary-like. Indexing x['field-name'] returns a new view to the array
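
Field access can be checked as well. In this sketch the structured dtype and the field names name and age are hypothetical, chosen only for illustration:

```python
import numpy as np

# A small structured array with two hypothetical fields.
people = np.array(
    [("Ann", 30), ("Bob", 25)],
    dtype=[("name", "U10"), ("age", "i4")],
)

ages = people["age"]        # field access by name

# The field result overlaps with the structured array in memory.
field_shares = np.shares_memory(people, ages)

# Writing through the field view updates the record in the original.
ages[0] = 31

print(field_shares, people[0]["age"])
```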



Upvotes: 0

Views: 823

Answers (2)

mon

Reputation: 22254

From Nicolas P. Rougier's From Python to Numpy, section Views and copies:

First, we have to distinguish between indexing and fancy indexing. The first will always return a view while the second will return a copy. This difference is important because in the first case, modifying the view modifies the base array while this is not true in the second case.

If you are unsure if the result of your indexing is a view or a copy, you can check what is the base of your result. If it is None, then your result is a copy:

>>> import numpy as np
>>> Z = np.random.uniform(0,1,(5,5))
>>> Z1 = Z[:3,:]
>>> Z2 = Z[[0,1,2], :]
>>> print(np.allclose(Z1,Z2))
True
>>> print(Z1.base is Z)
True
>>> print(Z2.base is Z)
False
>>> print(Z2.base is None)
True

If base refers to the source array, it is not None, so the result is a view.
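
The practical consequence described in the quote, that modifying a view modifies the base array while modifying a copy does not, can be sketched as follows. An array of zeros is used instead of random values so the effect is easy to see.

```python
import numpy as np

Z = np.zeros((5, 5))
Z1 = Z[:3, :]            # indexing with slices
Z2 = Z[[0, 1, 2], :]     # fancy indexing

Z1[0, 0] = 1.0           # written through Z1, shows up in Z
Z2[1, 1] = 2.0           # written to Z2 only, Z is unchanged

print(Z[0, 0], Z[1, 1])
```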

Upvotes: 1

Akshay Sehgal

Reputation: 19322

It's true that to get a good grasp of what returns a view and what returns a copy, you need to read the documentation thoroughly (and it does not always state this explicitly). I can't give you a complete list of operations and their result types (view or copy), but this may help you on your quest.

You can use np.shares_memory() to check whether a function returns a view or a copy of the original array.

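import numpy as np

x = np.array([1, 2, 3, 4])
x1 = x                 # plain assignment: same object
x2 = np.sqrt(x)        # ufunc result: new array
x3 = x[1:2]            # basic slice
x4 = x[1::2]           # basic slice with a step
x5 = x.reshape(-1,2)   # reshape (a view here)
x6 = x[:,None]         # newaxis
x7 = x[None,:]         # newaxis
x8 = x6+x7             # broadcast arithmetic: new array
x9 = x5[1,0:2]         # integer + slice: basic indexing
x10 = x5[[0,1],0:2]    # list + slice: advanced indexing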


print(np.shares_memory(x, x1))
print(np.shares_memory(x, x2))
print(np.shares_memory(x, x3))
print(np.shares_memory(x, x4))
print(np.shares_memory(x, x5))
print(np.shares_memory(x, x6))
print(np.shares_memory(x, x7))
print(np.shares_memory(x, x8))
print(np.shares_memory(x, x9))
print(np.shares_memory(x, x10))

True
False
True
True
True
True
True
False
True
False

Notice the last two examples. x5[1,0:2] contains only an integer and a slice, so it is basic indexing and returns a view; x5[[0,1],0:2] combines an advanced index (a list) with a slice and returns a copy. The documentation describes how the combined case behaves, which also gives some insight into how it is implemented:

When there is at least one slice (:), ellipsis (...) or newaxis in the index (or the array has more dimensions than there are advanced indexes), then the behaviour can be more complicated. It is like concatenating the indexing result for each advanced index element
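
The concatenation description can be illustrated with a sketch like the one below, which compares a combined index against stacking the per-element results by hand. The array a and the name rows are made up for the example, and np.stack stands in for the concatenation the documentation describes rather than showing how NumPy implements it.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
rows = [2, 0]

# Advanced index (rows) combined with a basic slice.
combined = a[rows, 1:3]

# Index with each element of rows separately, then stack the results.
stacked = np.stack([a[r, 1:3] for r in rows])

same_values = np.array_equal(combined, stacked)
print(combined)
print(same_values)
```

Each a[r, 1:3] on its own is basic indexing and gives a view, but the combined result is a new array.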

Upvotes: 3
