Reputation: 24758
I have:
>>> a
array([[1, 2],
[3, 4]])
>>> type(l), l # list of scalars
(<type 'list'>, [0, 1])
>>> type(i), i # a numpy array
(<type 'numpy.ndarray'>, array([0, 1]))
>>> type(j), j # list of numpy arrays
(<type 'list'>, [array([0, 1]), array([0, 1])])
When I do
>>> a[l] # Case 1, l is a list of scalars
I get
array([[1, 2],
[3, 4]])
which means indexing happened only on 0th axis.
But when I do
>>> a[j] # Case 2, j is a list of numpy arrays
I get
array([1, 4])
which means indexing happened along axis 0 and axis 1.
Q1: When used for indexing, why is a list of scalars treated differently from a list of numpy arrays? (Case 1 vs Case 2). In Case 2, I was hoping to see indexing happen only along axis 0 and get
array( [[[1,2],
[3,4]],
[[1,2],
[3,4]]])
Now, when using numpy array of arrays instead
>>> j1 = np.array(j) # numpy array of arrays
The result below indicates that indexing happened only along axis 0 (as expected)
>>> a[j1] # Case 3, j1 is a numpy array of numpy arrays
array([[[1, 2],
[3, 4]],
[[1, 2],
[3, 4]]])
Q2: When used for indexing, why is there a difference in treatment of list of numpy arrays and numpy array of numpy arrays? (Case 2 vs Case 3)
Upvotes: 7
Views: 5640
Reputation: 231375
Case 1: a[l] is actually a[(l,)], which expands to a[(l, slice(None))]. That is, it indexes the first dimension with the list l and adds an automatic trailing : slice. Indices are passed to the array's __getitem__ as a tuple, and an extra pair of () may be added without confusion.
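The equivalence described for Case 1 can be checked directly. This is a minimal sketch that rebuilds the question's a and l; the names r1 to r4 and same are only illustrative.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
l = [0, 1]  # list of scalars, as in the question

r1 = a[l]                 # bare list
r2 = a[(l,)]              # same index wrapped in a 1-tuple
r3 = a[(l, slice(None))]  # trailing slice written out
r4 = a[l, :]              # the usual spelling of r3

# All four spellings select rows 0 and 1 and keep every column.
same = all(np.array_equal(r1, r) for r in (r2, r3, r4))
print(same)
```

Each form indexes only axis 0, which is why the result has the same shape as a.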
Case 2: a[j] is treated as a[array([0, 1]), array([0, 1])] or a[(array([0, 1]), array([0, 1]))]. In other words, the list is read as a tuple of indexing objects, one per dimension. It ends up returning a[0,0] and a[1,1].
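The tuple interpretation can be written out explicitly, which avoids relying on how a bare list is handled. As far as I know, later NumPy releases deprecated the list-as-tuple reading and now treat such a list like np.array(j), so a[j] itself may give a different result depending on the installed version; the sketch below therefore uses tuple(j). The names paired and manual are illustrative.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
j = [np.array([0, 1]), np.array([0, 1])]  # list of numpy arrays

# Explicit tuple: one index array per dimension (rows, then columns).
paired = a[tuple(j)]

# The same selection spelled out element by element: a[0, 0] and a[1, 1].
rows, cols = j
manual = np.array([a[r, c] for r, c in zip(rows, cols)])

print(paired)  # [1 4]
```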
Case 3: a[j1] is a[(j1, slice(None))], applying the j1 index to just the first dimension.
Case 2 is a bit of an anomaly. Your intuition is valid, but for historical reasons, this list of arrays (or list of lists) is interpreted as a tuple of arrays.
This has been discussed in other SO questions, and I think it is documented. But off hand I can't find those references.
So it's safer to use either a tuple of indexing objects, or an array. Indexing with a list has a potential ambiguity.
numpy array indexing: list index and np.array index give different result
This SO question touches on the same issue, though the clearest statement of what is happening is buried in a code link in a comment by @user2357112.
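To make the "tuple or array" advice concrete, here is a small sketch of the two unambiguous spellings, using the question's a and j. The names per_dim and first_axis are illustrative only.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
j = [np.array([0, 1]), np.array([0, 1])]

# Tuple: each array indexes its own dimension, so the pairs are (0,0) and (1,1).
per_dim = a[tuple(j)]

# Array: the whole (2, 2) integer array indexes axis 0 only.
first_axis = a[np.array(j)]

print(per_dim.shape)     # (2,)
print(first_axis.shape)  # (2, 2, 2)
```

Stating the intent with tuple(...) or np.array(...) should give the same result regardless of how a given NumPy version interprets a bare list.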
Another way to force Case 3-style indexing is to make the second-dimension slice explicit, a[j,:]:
In [166]: a[j]
Out[166]: array([1, 4])
In [167]: a[j,:]
Out[167]:
array([[[1, 2],
[3, 4]],
[[1, 2],
[3, 4]]])
(I often include the trailing : even if it isn't needed. It makes it clear to me, and readers, how many dimensions we are working with.)
Upvotes: 2
Reputation: 20214
A1: The structure of l is not the same as that of j. l is one-dimensional, while j is two-dimensional. If you change one of them so that both are two-dimensional:
# l = [0, 1] # just one dimension!
l = [[0, 1], [0, 1]] # two dimensions
j = [np.array([0,1]), np.array([0, 1])] # two dimensions
Then they behave the same way.
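A quick way to check this claim is to compare the two structures once the interpretation is made explicit. In this sketch, l2 is a hypothetical name for the two-dimensional version of l, and the comparison uses tuple(...) and np.array(...) because the result of indexing with a bare nested list may depend on the NumPy version.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
l2 = [[0, 1], [0, 1]]                     # nested list, two dimensions
j = [np.array([0, 1]), np.array([0, 1])]  # list of arrays, two dimensions

# Both convert to the same (2, 2) integer array.
print(np.array(l2).shape, np.array(j).shape)  # (2, 2) (2, 2)

# Under either explicit interpretation the two give identical results.
same_as_tuple = np.array_equal(a[tuple(l2)], a[tuple(j)])
same_as_array = np.array_equal(a[np.array(l2)], a[np.array(j)])
print(same_as_tuple, same_as_array)
```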
A2: The same applies here: the index structures in Case 2 and Case 3 are not the same.
Upvotes: 0