Reputation: 5470
I have a NumPy array in which some elements are equal, i.e. there are ties, and I am applying np.argsort to find the indices that sort the array:
In [29]: x = [1, 2, 1, 1, 5, 2]
In [30]: np.argsort(x)
Out[30]: array([0, 2, 3, 1, 5, 4])
In [31]: np.argsort(x)
Out[31]: array([0, 2, 3, 1, 5, 4])
As can be seen here, the outputs from running argsort twice are identical. However, array([2, 3, 0, 5, 1, 4]) is also a completely valid output, because some elements in the original array are equal. Can I make argsort return such "randomized" outputs when there are ties in my array? If not, what is a workaround? I don't want to bias my choice of the lowest values in the array when I am picking them.
Upvotes: 1
Views: 306
Reputation: 221624
One trick would be to add uniform noise in the range [0, 1) and then argsort. For integer-valued data such as x here, distinct values differ by at least 1, so the noise can never push an element past a different value. The sort order is therefore randomized only within each group of tied elements:
(x+np.random.rand(len(x))).argsort()
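The noise trick depends on distinct values being at least 1 apart, so it does not carry over to float data as is. A minimal sketch of an alternative that avoids that assumption: shuffle the positions first, then apply a stable argsort to the shuffled values, so that tied elements keep whatever random order the shuffle gave them. The function name `random_tiebreak_argsort` and its `rng` parameter are illustrative, not part of NumPy.

```python
import numpy as np

def random_tiebreak_argsort(x, rng=None):
    """Return indices that sort x, with ties placed in random order.

    Hypothetical helper for illustration; works for integer and float data.
    """
    x = np.asarray(x)
    rng = np.random.default_rng() if rng is None else rng
    # Random permutation of the positions 0..len(x)-1.
    perm = rng.permutation(len(x))
    # A stable sort keeps equal values in their (now shuffled) order.
    order = np.argsort(x[perm], kind="stable")
    # Map positions in the shuffled array back to original indices.
    return perm[order]

x = [1, 2, 1, 1, 5, 2]
# The order within each group of ties varies from call to call.
print(random_tiebreak_argsort(x))
```

Passing a seeded generator, e.g. `np.random.default_rng(0)`, makes the result reproducible.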
Upvotes: 3