Reputation: 3083
How can I find the first index of a value in each row of a 2D array, using vectorized numpy functions?
For example, given
I = numpy.array([1,1,1])
M = numpy.array([[1,2,3],[2,3,1],[3,1,2]])
The output should be:
array([0, 2, 1])
I can do it with a list comprehension like this:
[ numpy.where(M[i] == I[i])[0][0] for i in range(0, len(I)) ]
What would the numpy equivalent be?
Upvotes: 4
Views: 1216
Reputation: 14377
One way to exploit vectorization is as follows:
import numpy as np
coords = ((I[:, np.newaxis] == M) * np.arange(M.shape[1], 0, -1)[np.newaxis, :]).argmax(1)
found = (I[:, np.newaxis] == M).any(1)  # named found to avoid shadowing the builtin any
coords = coords[found]
Multiplying each row of the comparison by a decreasing counter disambiguates between several occurrences of the value of interest in the same row: the first occurrence gets the highest value, so argmax picks it. If a given row does not contain the indicated value, it is removed from coords. The remaining rows (those in which the corresponding value was found) are the ones where found is True.
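As a rough, self-contained sketch of the approach described above, applied to the example from the question. The function name first_match_weighted and the variable names matches, weights, and found are only illustrative and do not come from the answer itself.

```python
import numpy as np


def first_match_weighted(I, M):
    """Hypothetical helper: first column index of I[r] in row r of M.

    Returns (coords, found); coords[r] is only meaningful where found[r] is True.
    """
    # matches[r, c] is True where M[r, c] == I[r] (I is broadcast down the columns).
    matches = I[:, np.newaxis] == M
    # Decreasing weights make the leftmost match the largest entry in its row.
    weights = np.arange(M.shape[1], 0, -1)
    coords = (matches * weights).argmax(axis=1)
    # Rows without any match would otherwise report index 0, so flag them.
    found = matches.any(axis=1)
    return coords, found


I = np.array([1, 1, 1])
M = np.array([[1, 2, 3], [2, 3, 1], [3, 1, 2]])

coords, found = first_match_weighted(I, M)
print(coords[found])  # [0 2 1]
```

Note that argmax on a boolean array already returns the first True in each row, so (I[:, np.newaxis] == M).argmax(1) should give the same indices here; the weights mainly make the intent explicit.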
Upvotes: 2
Reputation: 54330
I think these might do it, step by step:
In [52]:
I = np.array([1,1,1])
#M = np.array([[1,2,3],[2,3,1],[3,1,2]])
M = np.array([[4,2,3],[2,3,4],[3,4,2]])
In [53]:
I1=I.reshape((-1,1))
In [54]:
M1=np.hstack((M, I1))
In [55]:
np.apply_along_axis(np.argmax, 1, (M1-I1)==0)
Out[55]:
array([3, 3, 3])
If the number is not found in M, the resulting index is M.shape[1]. Since the result is an array of int, putting a nan in those cells is not an option. But we may consider putting -1 for those cases. If the result is stored in result:
result[result==(M.shape[1])]=-1
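Putting the steps together, a minimal sketch might look like the following. The function name first_index_or_minus_one is made up for illustration, and it calls argmax(axis=1) directly on the boolean comparison instead of going through np.apply_along_axis, which should give the same indices.

```python
import numpy as np


def first_index_or_minus_one(I, M):
    """Hypothetical helper: first index of I[r] in row r of M, or -1 if absent."""
    I1 = I.reshape((-1, 1))
    # Append I as a sentinel column, so every row has at least one match.
    M1 = np.hstack((M, I1))
    # argmax on a boolean array returns the position of the first True.
    result = (M1 == I1).argmax(axis=1)
    # A hit in the sentinel column means the value was not found in M.
    result[result == M.shape[1]] = -1
    return result


I = np.array([1, 1, 1])
print(first_index_or_minus_one(I, np.array([[1, 2, 3], [2, 3, 1], [3, 1, 2]])))  # [0 2 1]
print(first_index_or_minus_one(I, np.array([[4, 2, 3], [2, 3, 4], [3, 4, 2]])))  # [-1 -1 -1]
```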
Upvotes: 1