Reputation: 603
I am using numpy to tally a lot of values across many large arrays, and keep track of which positions the maximum values appear in.
In particular, imagine I have a 'data' array and a 'counts' array of the same shape:
data = numpy.array([[ 5, 10, 3],
[ 6, 9, 12],
[13, 3, 9],
[ 9, 3, 1],
...
])
# numpy.int was removed in NumPy 1.24; the builtin int works the same way here
counts = numpy.zeros(data.shape, dtype=int)
'data' is going to change a lot, but I want 'counts' to reflect the number of times the max has appeared in each position:
max_value_indices = numpy.argmax(data, axis=1)
# this is now [1, 2, 0, 0, ...] representing the positions of 10, 12, 13 and 9, respectively.
From what I understand of broadcasting in numpy, I should be able to say:
counts[max_value_indices] += 1
What I expect is the array to be updated:
[[0, 1, 0],
[0, 0, 1],
[1, 0, 0],
[1, 0, 0],
...
]
But instead this increments ALL the values in counts, giving me:
[[1, 1, 1],
[1, 1, 1],
[1, 1, 1],
[1, 1, 1],
...
]
I also thought that perhaps if I transformed max_value_indices to a 100x1 array, it might work:
counts[max_value_indices[:,numpy.newaxis]] += 1
but this has the effect of updating just the rows at positions 0, 1, and 2:
[[1, 1, 1],
[1, 1, 1],
[1, 1, 1],
[0, 0, 0],
...
]
I'm also happy to turn the indices array into an array of 0's and 1's, and then add it to the counts array each time, but I'm not sure how to construct that.
Upvotes: 2
Views: 721
Reputation: 879621
You could use so-called advanced integer indexing (aka Multidimensional list-of-locations indexing):
In [24]: counts[np.arange(data.shape[0]),
np.argmax(data, axis=1)] += 1
In [25]: counts
Out[25]:
array([[0, 1, 0],
[0, 0, 1],
[1, 0, 0],
[1, 0, 0]])
The first array, np.arange(data.shape[0]), specifies the row. The second array, np.argmax(data, axis=1), specifies the column.
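For completeness, here is a self-contained sketch of the same idea using the four sample rows from the question. The names rows, cols and mask are only illustrative. The reason counts[max_value_indices] touches whole rows is that a single index array is applied to the first axis only; pairing it with a row index array picks one element per row instead. The second half shows one possible way to build the 0/1 array mentioned in the question, by comparing column numbers against the argmax result.

```python
import numpy as np

data = np.array([[ 5, 10,  3],
                 [ 6,  9, 12],
                 [13,  3,  9],
                 [ 9,  3,  1]])
counts = np.zeros(data.shape, dtype=int)

rows = np.arange(data.shape[0])   # one row index per row of data
cols = np.argmax(data, axis=1)    # column of the maximum in each row

# (rows[i], cols[i]) addresses exactly one element per row.
counts[rows, cols] += 1

# Alternative: build an explicit 0/1 mask and add it.
# Column numbers (shape (3,)) are compared against cols as a column
# vector (shape (4, 1)), which broadcasts to a (4, 3) boolean array.
mask = (np.arange(data.shape[1]) == cols[:, np.newaxis]).astype(int)
counts += mask
```

Both updates are applied here, so each maximum position ends up counted twice. In real use you would pick one of the two and repeat it every time data changes.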
Upvotes: 2