TheDoctorsKid

Reputation: 1

How to index elements from a column of an ndarray such that the output is a column vector?

I have an n×2 array of points stored as an ndarray. I want to select some elements (the indices are also given as an ndarray) from one of the two columns so that the output is a column vector. However, if the index array contains only one index, a (1,)-shaped array should be returned.

I have already tried the following:

import numpy as np

points = np.array([[0, 1], [1, 1.5], [2.5, 0.5], [4, 1], [5, 2]])
index = np.array([0, 1, 2])

points[index, [0]]            # array([0. , 1. , 2.5]), shape (3,)
points[[index], 0]            # array([[0. , 1. , 2.5]]), shape (1, 3)
points[[index], [0]]          # array([[0. , 1. , 2.5]]), shape (1, 3)
points[index, 0, np.newaxis]  # array([[0. ], [1. ], [2.5]]), shape (3, 1), desired

np.newaxis works for this scenario. However, if the index array contains only one value, it does not deliver the right shape:

import numpy as np

points = np.array([[0, 1], [1, 1.5], [2.5, 0.5], [4, 1], [5, 2]])
index = np.array([0])

points[index, 0, np.newaxis]  # array([[0.]]), shape (1, 1)
points[index, [0]]            # array([0.]), shape (1,), desired

Is there a way to index the ndarray such that the output has shape (3, 1) for the first example and (1,) for the second, without branching on the size of the index array?

Thanks in advance for your help!

Upvotes: 0

Views: 34

Answers (2)

hpaulj

Reputation: 231605

In [329]: points = np.array([[0, 1], [1, 1.5], [2.5, 0.5], [4, 1], [5, 2]]) 
     ...: index = np.array([0, 1, 2])  

We can select 3 rows with:

In [330]: points[index,:]                                                                                    
Out[330]: 
array([[0. , 1. ],
       [1. , 1.5],
       [2.5, 0.5]])

However, if we select a column as well, the result is 1-D, even if we use [0] as the column index. That's because the (3,) row index is broadcast against the (1,) column index, giving a (3,) result:

In [331]: points[index,0]                                                                                    
Out[331]: array([0. , 1. , 2.5])
In [332]: points[index,[0]]                                                                                  
Out[332]: array([0. , 1. , 2.5])

If we give the row index a (3,1) shape, the result is also (3,1):

In [333]: points[index[:,None],[0]]                                                                          
Out[333]: 
array([[0. ],
       [1. ],
       [2.5]])
In [334]: points[index[:,None],0]                                                                            
Out[334]: 
array([[0. ],
       [1. ],
       [2.5]])
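
The shapes in the sessions above follow from broadcasting the row and column index arrays against each other. A minimal sketch to check this, assuming np.broadcast_shapes is available (NumPy 1.20 or later); the names col, shape_1d and shape_2d are only for illustration:

```python
import numpy as np

points = np.array([[0, 1], [1, 1.5], [2.5, 0.5], [4, 1], [5, 2]])
index = np.array([0, 1, 2])
col = np.array([0])

# With two integer index arrays, the result takes their broadcast shape.
shape_1d = np.broadcast_shapes(index.shape, col.shape)           # (3,) with (1,)
shape_2d = np.broadcast_shapes(index[:, None].shape, col.shape)  # (3, 1) with (1,)

result_1d = points[index, col]
result_2d = points[index[:, None], col]

print(shape_1d, result_1d.shape)
print(shape_2d, result_2d.shape)
```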

We get the same thing if we use a row slice:

In [335]: points[0:3,[0]]                                                                                    
Out[335]: 
array([[0. ],
       [1. ],
       [2.5]])

Using [index] doesn't help because it gives the row index a (1,3) shape, which produces a (1,3) result. Of course, you could transpose that to get (3,1).
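
A short sketch of the transpose route, writing the (1,3) row index explicitly as index[None, :] instead of wrapping index in a list (the variable names are only for illustration):

```python
import numpy as np

points = np.array([[0, 1], [1, 1.5], [2.5, 0.5], [4, 1], [5, 2]])
index = np.array([0, 1, 2])

row_index = index[None, :]     # shape (1, 3)
as_row = points[row_index, 0]  # result shape (1, 3)
as_column = as_row.T           # transposed to shape (3, 1)

print(as_row.shape, as_column.shape)
```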

With a 1 element index:

In [336]: index1 = np.array([0])                                                                             
In [337]: points[index1[:,None],0]                                                                           
Out[337]: array([[0.]])
In [338]: _.shape                                                                                            
Out[338]: (1, 1)
In [339]: points[index1,0]                                                                                   
Out[339]: array([0.])
In [340]: _.shape                                                                                            
Out[340]: (1,)

If the row index were a scalar (0-d array) rather than 1-D:

In [341]: index1 = np.array(0)                                                                               
In [342]: points[index1[:,None],0]                                                                           
...
IndexError: too many indices for array
In [343]: points[index1[...,None],0]   # use ... instead
Out[343]: array([0.])
In [344]: points[index1, 0]   # scalar result
Out[344]: 0.0

I think handling the np.array([0]) case separately requires an if test. At least, I can't think of a built-in NumPy way of burying it.
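
One way to bury that if test is a small wrapper. This is only a sketch: select_column is a hypothetical helper name, not a NumPy function, and it assumes index is an integer array or scalar.

```python
import numpy as np

def select_column(points, index, col=0):
    """Hypothetical helper: (n, 1) column for several indices, (1,) for one."""
    index = np.atleast_1d(index)
    values = points[index, col]  # always 1-D here, shape (n,)
    if index.size == 1:
        return values            # keep shape (1,)
    return values[:, None]       # add a trailing axis, shape (n, 1)

points = np.array([[0, 1], [1, 1.5], [2.5, 0.5], [4, 1], [5, 2]])
many = select_column(points, np.array([0, 1, 2]))
single = select_column(points, np.array([0]))

print(many.shape, single.shape)
```

The branch is still there; it is just kept in one place so the calling code does not have to repeat it.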

Upvotes: 1

Joseph Holland

Reputation: 144

I'm not certain I understand the wording of your question, but it seems you may be after the ndarray.swapaxes method (see https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.ndarray.swapaxes.html#numpy.ndarray.swapaxes).

For your snippet:

import numpy as np

points = np.array([[0, 1], [1, 1.5], [2.5, 0.5], [4, 1], [5, 2]])
swapped = points.swapaxes(0,1)
print(swapped)

gives

[[0.  1.  2.5 4.  5. ]
 [1.  1.5 0.5 1.  2. ]]
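
For a 2-D array, swapaxes(0, 1) gives the same result as the .T attribute, and it shares memory with the original array rather than copying it. A quick check, with variable names chosen only for illustration:

```python
import numpy as np

points = np.array([[0, 1], [1, 1.5], [2.5, 0.5], [4, 1], [5, 2]])
swapped = points.swapaxes(0, 1)

same_as_T = np.array_equal(swapped, points.T)  # same values as the transpose
is_view = np.shares_memory(swapped, points)    # no data was copied

print(swapped.shape, same_as_T, is_view)
```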

Upvotes: 0
