jds

Reputation: 8259

How to get a value from every column in a Numpy matrix

I'd like to get the index of a value for every column in a matrix M. For example:

M = matrix([[0, 1, 0],
            [4, 2, 4],
            [3, 4, 1],
            [1, 3, 2],
            [2, 0, 3]])

In pseudocode, I'd like to do something like this:

for col in M:
    idx = numpy.where(M[col]==0) # Only for columns!

and have idx be 0, 4, 0 for each column.

I have tried to use where, but I don't understand the return value, which is a tuple of matrices.

Upvotes: 5

Views: 1233

Answers (5)

senderle

Reputation: 150957

The tuple of matrices is a set of index arrays, one per dimension, suited for indexing back into M. When you index with it, the result has the shape of the index matrices (or arrays), and each item in the result is selected from the original array using the first matrix as the index along the first dimension, the second as the index along the second dimension, and so on. In other words, this:

>>> numpy.where(M == 0)
(matrix([[0, 0, 4]]), matrix([[0, 2, 1]]))
>>> row, col = numpy.where(M == 0)
>>> M[row, col]
matrix([[0, 0, 0]])
>>> M[numpy.where(M == 0)] = 1000
>>> M
matrix([[1000,    1, 1000],
        [   4,    2,    4],
        [   3,    4,    1],
        [   1,    3,    2],
        [   2, 1000,    3]])

The sequence may be what's confusing you. It proceeds in flattened order -- so M[0,2] appears second, not third. If you need to reorder them, you could do this:

>>> row[0,col.argsort()]
matrix([[0, 4, 0]])

You also might be better off using arrays instead of matrices. That way you can manipulate the shape of the arrays, which is often useful! Also note ajcr's transpose-based trick, which is probably preferable to using argsort.

Finally, there is also a nonzero method that does the same thing as where in this case. Using the transpose trick now (on the original M, before the zeros were overwritten with 1000 above):

>>> (M == 0).T.nonzero()
(matrix([[0, 1, 2]]), matrix([[0, 4, 0]]))
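
To illustrate the point about arrays, here is a minimal sketch of the same lookup on a plain ndarray (the name A is mine, not from the question). With a 2-D array, nonzero returns flat 1-D index arrays, so there is nothing to unwrap:

```python
import numpy as np

# Same data as in the question, held as a plain 2-D ndarray.
A = np.array([[0, 1, 0],
              [4, 2, 4],
              [3, 4, 1],
              [1, 3, 2],
              [2, 0, 3]])

# Transposing first makes nonzero() report hits column by column.
cols, rows = np.nonzero(A.T == 0)

print(cols)  # [0 1 2]
print(rows)  # [0 4 0]
```

This assumes each column contains the value exactly once; otherwise rows will have more or fewer entries than there are columns.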

Upvotes: 3

Alex Riley

Reputation: 176750

As an alternative to np.where, you could perhaps use np.argwhere to return an array of indexes where the array meets the condition:

>>> np.argwhere(M == 0)
array([[[0, 0]],

       [[0, 2]],

       [[4, 1]]])

This tells you each of the indexes, in the format [row, column], where the condition was met.

If you'd prefer the output array to be grouped by column rather than by row (that is, in the format [column, row]), just use the method on the transpose of the array:

>>> np.argwhere(M.T == 0).squeeze()
array([[0, 0],
       [1, 4],
       [2, 0]])

I also used np.squeeze here to get rid of axis 1, so that we are left with a 2D array. The sequence you want is the second column, i.e. np.argwhere(M.T == 0).squeeze()[:, 1].
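
For comparison, a minimal sketch assuming the data is held in a plain ndarray rather than a matrix (the name A is my own). In that case argwhere already returns a 2-D array, so the squeeze step shouldn't be needed:

```python
import numpy as np

A = np.array([[0, 1, 0],
              [4, 2, 4],
              [3, 4, 1],
              [1, 3, 2],
              [2, 0, 3]])

# Each row of `pairs` is [column, row] because we search the transpose.
pairs = np.argwhere(A.T == 0)
print(pairs.shape)  # (3, 2)

# The second column holds the row index of the 0 in each column of A.
idx = pairs[:, 1]
print(idx)  # [0 4 0]
```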

Upvotes: 2

Irshad Bhat

Reputation: 8709

>>> import numpy as np
>>> M = np.array([[0, 1, 0],
...               [4, 2, 4],
...               [3, 4, 1],
...               [1, 3, 2],
...               [2, 0, 3]])
>>> # For each column i, take the row index of the first 0 in that column
>>> [np.where(M[:,i]==0)[0][0] for i in range(M.shape[1])]
[0, 4, 0]
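
Note that the [0][0] lookup raises an IndexError if a column does not contain the value at all. One possible vectorized variant that tolerates this is sketched below; the function name first_index_per_column and the -1 sentinel are my own choices for illustration, not part of NumPy:

```python
import numpy as np

def first_index_per_column(a, value, missing=-1):
    """Row index of the first occurrence of `value` in each column of `a`.

    Columns that do not contain `value` get `missing` (a hypothetical
    sentinel chosen for this sketch).
    """
    mask = np.asarray(a) == value
    # argmax on a boolean array returns the position of the first True,
    # or 0 when the column has no True at all.
    first = mask.argmax(axis=0)
    # Use any() to tell a genuine hit at row 0 apart from "no hit".
    return np.where(mask.any(axis=0), first, missing)

M = np.array([[0, 1, 0],
              [4, 2, 4],
              [3, 4, 1],
              [1, 3, 2],
              [2, 0, 3]])

print(first_index_per_column(M, 0))  # [0 4 0]
print(first_index_per_column(M, 9))  # [-1 -1 -1]
```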

Upvotes: 0

xnx

Reputation: 25478

This adds nothing new to what's already been suggested, but a one-line solution is:

>>> np.where(np.array(M.T)==0)[-1]
array([0, 4, 0])

(I agree that NumPy matrix objects are more trouble than they're worth).

Upvotes: 0

Akavall

Reputation: 86168

The result of np.where(M == 0) looks like this:

(matrix([[0, 0, 4]]), matrix([[0, 2, 1]]))

The first matrix tells you the rows where the 0s are, and the second matrix tells you the columns. Pairing them up position by position gives the coordinates (0, 0), (0, 2) and (4, 1):

In [4]: M
Out[4]:
matrix([[0, 1, 0],
        [4, 2, 4],
        [3, 4, 1],
        [1, 3, 2],
        [2, 0, 3]])

In [5]: np.where(M == 0)
Out[5]: (matrix([[0, 0, 4]]), matrix([[0, 2, 1]]))

In [6]: M[0,0] 
Out[6]: 0

In [7]: M[0,2] #0th row 2nd column
Out[7]: 0

In [8]: M[4,1] #4th row 1st column
Out[8]: 0
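
The same pairing can be done in code. This is only a sketch, assuming a plain ndarray (named A here, my own name) and at most one 0 per column; the dictionary by_col is likewise just an illustration:

```python
import numpy as np

A = np.array([[0, 1, 0],
              [4, 2, 4],
              [3, 4, 1],
              [1, 3, 2],
              [2, 0, 3]])

rows, cols = np.where(A == 0)

# zip pairs the i-th row index with the i-th column index.
coords = [(int(r), int(c)) for r, c in zip(rows, cols)]
print(coords)  # [(0, 0), (0, 2), (4, 1)]

# Map column -> row; a later hit in the same column would overwrite an earlier one.
by_col = {c: r for r, c in coords}
print([by_col[c] for c in sorted(by_col)])  # [0, 4, 0]
```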

Upvotes: 0
