Donbeo

Reputation: 17617

numpy select submatrix according to index

I want to select a submatrix given an array of row indices and an array of column indices.

I get a strange error. I am able to index the matrix by the row indices or by the column indices, but not by both at the same time.

How can I solve that?

>>> X.shape
(1000, 30)
>>> type(X)
<class 'numpy.ndarray'>
>>> X
array([[ 0.06349252, -0.19222932, -0.51720414, ...,  0.17566853,
         0.15821072,  0.0478738 ],
       [ 0.88497758,  0.22215627,  1.63248497, ...,  0.77716638,
         0.76535743,  0.11670681],
       [ 0.13308973, -0.12106689, -0.51353645, ...,  1.32546684,
         0.8276816 ,  1.25001549],
       ..., 
       [-0.25907157, -0.24458445, -0.87298188, ...,  0.6467455 ,
         0.43216921,  0.57972136],
       [ 1.23272918,  0.14475037,  0.16869452, ...,  0.27710557,
        -1.39863587, -0.10482702],
       [-0.57754589,  0.77061869,  1.88473625, ...,  0.31680682,
         1.64699058,  0.92152533]])
>>> j = np.random.choice(10, 5)
>>> i = np.random.choice(10,1000)
>>> X[i, :]
array([[-0.90775982,  0.82286474, -0.94136182, ...,  1.11494763,
         0.04252439,  1.08999938],
       [-2.51998203, -0.47154878, -0.88228892, ..., -0.03526119,
         0.40444398,  0.27545503],
       [-0.90775982,  0.82286474, -0.94136182, ...,  1.11494763,
         0.04252439,  1.08999938],
       ..., 
       [ 0.29236619, -1.53595325,  0.77567467, ...,  0.45090184,
         1.49180382,  1.04571078],
       [ 0.13308973, -0.12106689, -0.51353645, ...,  1.32546684,
         0.8276816 ,  1.25001549],
       [ 0.57790133, -1.11712824, -0.47716697, ...,  0.27169274,
        -0.84223531, -0.99293644]])
>>> X[:, j]
array([[-0.51720414,  0.60436212,  0.54243319,  0.06349252, -0.19222932],
       [ 1.63248497, -0.75034999, -0.41102324,  0.88497758,  0.22215627],
       [-0.51353645,  0.74373642, -0.76499708,  0.13308973, -0.12106689],
       ..., 
       [-0.87298188, -0.14638175,  0.0278893 , -0.25907157, -0.24458445],
       [ 0.16869452, -0.42747292,  0.49202016,  1.23272918,  0.14475037],
       [ 1.88473625, -0.21566782, -0.52799588, -0.57754589,  0.77061869]])
>>> X[i, j]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: shape mismatch: objects cannot be broadcast to a single shape
>>> 

Upvotes: 2

Views: 4764

Answers (1)

grc

Reputation: 23565

I think you'll have to index them separately:

X[i, :][:, j]
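As a rough sketch of what the chained indexing produces, here is a small check on a random array of the same shape as in the question. The random data and the seed are assumptions for illustration only. It also compares the result with `np.ix_`, which builds an open mesh from the two index arrays and should give the same submatrix in a single indexing step:

```python
import numpy as np

# Stand-in data with the same shape as X in the question (values are arbitrary).
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 30))
i = rng.choice(10, 1000)  # 1000 row indices drawn from 0..9
j = rng.choice(10, 5)     # 5 column indices drawn from 0..9

# Chained indexing: pick the rows first, then the columns of that result.
sub = X[i, :][:, j]
print(sub.shape)  # (1000, 5)

# np.ix_ reshapes i to (1000, 1) and j to (1, 5) so they broadcast to a grid.
sub_ix = X[np.ix_(i, j)]
print(np.array_equal(sub, sub_ix))  # True
```

The chained form creates an intermediate copy of the selected rows, so `np.ix_` may be preferable when the row selection is large.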

The problem is that NumPy tries to pair each row index in i with the corresponding column index in j, but the two arrays have different lengths (1000 and 5), so they cannot be broadcast to a common shape.

For example, X[(1, 2), (3, 4)] will select the two elements X[1, 3] and X[2, 4], but X[(1, 2), (3, 4, 5)] fails with a shape mismatch because two row indices cannot be paired with three column indices.
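The pairing behaviour can be seen on a small array. This is only a sketch: the 5x6 array below is made up for illustration, and the exception type is caught loosely because it differs between NumPy versions (the question shows ValueError, newer releases raise IndexError):

```python
import numpy as np

# Small illustrative array: X[r, c] == 6 * r + c.
X = np.arange(30).reshape(5, 6)

# Equal-length index sequences are paired element by element.
paired = X[(1, 2), (3, 4)]
print(paired)  # [ 9 16], i.e. X[1, 3] and X[2, 4]

# Unequal lengths cannot be paired, so NumPy raises a shape-mismatch error.
mismatch_raised = False
try:
    X[(1, 2), (3, 4, 5)]
except (IndexError, ValueError) as exc:
    mismatch_raised = True
    print("mismatch:", exc)

# Making the row indices a column vector lets the two arrays broadcast,
# which selects the full 2x3 block of rows (1, 2) and columns (3, 4, 5).
rows = np.array([1, 2])[:, None]
cols = np.array([3, 4, 5])
block = X[rows, cols]
print(block.shape)  # (2, 3)
```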

Upvotes: 6
