mottalrd

Reputation: 4490

Numpy indexing multidimensional arrays with array and slice

My question is about this example in the numpy docs.

y = np.arange(35).reshape(5,7)

This is the operation that I am trying to clarify:

y[np.array([0,2,4]),1:3]

According to the docs:
"In effect, the slice is converted to an index array np.array([[1,2]]) (shape (1,2)) that is broadcast with the index array to produce a resultant array of shape (3,2)."

The following does not work, so I assume it is not equivalent:

y[np.array([0,2,4]), np.array([1,2])]

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-140-f4cd35e70141> in <module>()
----> 1 y[np.array([0,2,4]), np.array([1,2])]

ValueError: shape mismatch: objects cannot be broadcast to a single shape

What does this broadcast array of shape (3,2) look like?

Upvotes: 2

Views: 3452

Answers (3)

hpaulj

Reputation: 231335

The broadcasting is more like:

In [280]: y[np.array([0,2,4])[...,None], np.array([1,2])]
Out[280]: 
array([[ 1,  2],
       [15, 16],
       [29, 30]])

I added a dimension to [0,2,4] making it 2d. broadcast_arrays can be used to see what the broadcasted arrays look like:

In [281]: np.broadcast_arrays(np.array([0,2,4])[...,None], np.array([1,2]))
Out[281]: 
[array([[0, 0],
        [2, 2],
        [4, 4]]), 
 array([[1, 2],
        [1, 2],
        [1, 2]])]

np.broadcast_arrays([[0],[2],[4]], [1,2]) is the same thing without the array wrappers. np.meshgrid([0,2,4], [1,2], indexing='ij') is another way of producing these indexing arrays.

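(The arrays produced by meshgrid or broadcast_arrays can be used as the index for y. If your NumPy version returns them as a list, convert it to a tuple first, since recent versions treat a list index as a single index array rather than one index per dimension.)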

So the docs are right to say that [1,2] is broadcast with the index array, but they omit the step of adding a dimension to [0,2,4] so that the two shapes are compatible.
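
A minimal sketch of that equivalence, assuming the same y as in the question. The names rows, cols, mixed, explicit and via_ix are only for illustration; np.ix_ is included as one more way to build the (3,1) and (1,2) index arrays.

```python
import numpy as np

y = np.arange(35).reshape(5, 7)
rows = np.array([0, 2, 4])
cols = np.array([1, 2])

mixed = y[rows, 1:3]               # advanced index combined with a slice
explicit = y[rows[:, None], cols]  # (3,1) index array broadcast against (2,)
via_ix = y[np.ix_(rows, cols)]     # np.ix_ returns index arrays of shape (3,1) and (1,2)

# Shape that the two index arrays broadcast to
print(np.broadcast(rows[:, None], cols).shape)  # (3, 2)
print(np.array_equal(mixed, explicit), np.array_equal(mixed, via_ix))
```

All three expressions should select the same 3x2 block; only the first one relies on NumPy handling the slice for you.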

A little earlier they have this example:

y[np.array([0,2,4])]

which is equivalent to y[np.array([0,2,4]), :]. It picks 3 rows, and all items from them. The 1:3 case can be thought of as an extension of this, picking 3 rows, and then 2 columns.

y[[0,2,4],:][:,1:3]

This might be a better way of thinking about the indexing if broadcasting is too confusing.
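
As a quick check of the two-step view, here is a small sketch (the variable names are illustrative) that compares the chained indexing with the single mixed expression:

```python
import numpy as np

y = np.arange(35).reshape(5, 7)

step1 = y[[0, 2, 4], :]       # pick 3 rows, keep all 7 columns -> shape (3, 7)
step2 = step1[:, 1:3]         # then keep columns 1 and 2       -> shape (3, 2)
one_shot = y[[0, 2, 4], 1:3]  # the same selection in one expression

print(step1.shape, step2.shape, one_shot.shape)
print(np.array_equal(step2, one_shot))
```

The chained form builds an intermediate (3,7) copy, so it is mainly useful as a mental model rather than as the preferred way to write the selection.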


There's another docs page that might handle this better:

http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html

In these docs, basic indexing involves slices and integers

y[:,1:3], y[1,:], y[1, 1:3]

Advanced indexing involves an array (or list)

y[[0,2,4],:]

This produces the same values as y[::2,:], except that the list (advanced) case produces a copy, while the slice (basic) case produces a view.
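
One way to see the copy/view difference is np.shares_memory. This is only a sketch; by_list and by_slice are illustrative names.

```python
import numpy as np

y = np.arange(35).reshape(5, 7)

by_list = y[[0, 2, 4], :]  # advanced indexing
by_slice = y[::2, :]       # basic indexing

same_values = np.array_equal(by_list, by_slice)
list_shares = np.shares_memory(by_list, y)    # copy: no shared buffer
slice_shares = np.shares_memory(by_slice, y)  # view: shares y's buffer

print(same_values, list_shares, slice_shares)  # True False True
```

In practice this means that writing into by_slice changes y, while writing into by_list does not.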

y[[0,2,4], [1,2,3]] is a case of pure advanced indexing; the result is 3 items, the ones at (0,1), (2,2), and (4,3).
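
To make the pairing explicit, this sketch compares the pure advanced index with the three elements looked up one at a time (picked and manual are illustrative names):

```python
import numpy as np

y = np.arange(35).reshape(5, 7)

picked = y[[0, 2, 4], [1, 2, 3]]                 # index arrays are paired element by element
manual = np.array([y[0, 1], y[2, 2], y[4, 3]])   # the same three (row, column) lookups

print(picked)  # [ 1 16 31]
print(np.array_equal(picked, manual))
```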

y[[0,2,4], 1:3] is a case that these docs call "Combining advanced and basic indexing": 'advanced' from '[0,2,4]', 'basic' from '1:3'.

http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#combining-advanced-and-basic-indexing


Looking at a more complex index array might add some insight.

In [222]: i=[[0,2],[1,4]]

Used with another list, it is 'pure' advanced, and the result is broadcasted:

In [224]: y[i, [1,2]]
Out[224]: 
array([[ 1, 16],
       [ 8, 30]])

The index arrays are:

In [234]: np.broadcast_arrays(i, [1,2])
Out[234]: 
[array([[0, 2],
        [1, 4]]), 
 array([[1, 2],
        [1, 2]])]

The [1,2] list is just expanded to a (2,2) array.

Using it with a slice is an example of this mixed advanced/basic indexing, and the result is 3d, with shape (2,2,2).

In [223]: y[i, 1:3]
Out[223]: 
array([[[ 1,  2],
        [15, 16]],

       [[ 8,  9],
        [29, 30]]])

The equivalent with broadcasting is

y[np.array(i)[...,None], [1,2]]
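
A short sketch checking that claim, assuming the same y and i as above (mixed and explicit are illustrative names):

```python
import numpy as np

y = np.arange(35).reshape(5, 7)
i = np.array([[0, 2], [1, 4]])

mixed = y[i, 1:3]                   # (2,2) index array combined with a slice
explicit = y[i[..., None], [1, 2]]  # (2,2,1) index array broadcast against (2,)

print(mixed.shape, explicit.shape)  # (2, 2, 2) (2, 2, 2)
print(np.array_equal(mixed, explicit))
```

The trailing None plays the same role as in the 1d case: it adds a length-1 axis for the column index to broadcast along.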

Upvotes: 1

user707650

You're right that the documentation may be incorrect here, or at least something is missing. I'd file an issue for that, for clarification in the documentation.

In fact, this part of the documentation shows almost exactly this example, together with the exception it raises:

>>> y[np.array([0,2,4]), np.array([0,1])]
<type 'exceptions.ValueError'>: shape mismatch: objects cannot be
broadcast to a single shape
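
The failure can be reproduced and then repaired with a small sketch. Which exception class is raised may depend on the NumPy version (the traceback above shows ValueError, while newer releases may report an IndexError), so catching both here is an assumption made for portability.

```python
import numpy as np

y = np.arange(35).reshape(5, 7)
rows = np.array([0, 2, 4])
cols = np.array([1, 2])

try:
    y[rows, cols]  # shapes (3,) and (2,) cannot be broadcast together
    failed = False
except (IndexError, ValueError) as exc:
    failed = True
    print(type(exc).__name__, exc)

# Adding an axis to rows gives shapes (3,1) and (2,), which broadcast to (3,2)
fixed = y[rows[:, None], cols]
print(fixed.shape)  # (3, 2)
```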

Upvotes: 1

galaxyan

Reputation: 6111

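The general pattern is y[index_array, begin:end]: the index array picks the rows, and the slice picks the columns from begin up to, but not including, end.

    import numpy as np
    y = np.arange(35).reshape(5,7)
    print(y)
    # [[ 0  1  2  3  4  5  6]
    #  [ 7  8  9 10 11 12 13]
    #  [14 15 16 17 18 19 20]
    #  [21 22 23 24 25 26 27]
    #  [28 29 30 31 32 33 34]]
    print(y[np.array([0,2,4]),1:3])
    # [[ 1  2]
    #  [15 16]
    #  [29 30]]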

Upvotes: 1
