Reputation: 1106
What's a good way to get the combinations of indices that point to unique elements in an array? For example, for a = [1,1,3,2], the possible sets of pointers would be {0,2,3} and {1,2,3}.
I can use argsort in combination with splitting the elements by frequency, and then use something like itertools.product to get all the sets of indices I want.
This is what I tried:
from numpy import array, split
from scipy.stats import itemfreq
from itertools import product
a = array([1,1,3,2])
fq = itemfreq(a)[:,1]
fq = [int(f + sum(fq[:i])) for i, f in enumerate(fq)]
print(list(product(*(ptrs for ptrs in split(a.argsort(), fq) if len(ptrs)))))
#> [(0, 3, 2), (1, 3, 2)]
How can I do this better?
Upvotes: 1
Views: 291
Reputation: 67427
@atomh33ls's answer can be vectorized as follows.
First, extract the inverse indices and counts of each unique item. If you are using numpy >= 1.9:
_, idx, cnt = np.unique(a, return_inverse=True, return_counts=True)
In older versions, this does the same:
_, idx = np.unique(a, return_inverse=True)
cnt = np.bincount(idx)
And now, with a little bit of magic, voilà:
>>> np.split(np.arange(len(a))[np.argsort(idx)], np.cumsum(cnt)[:-1])
[array([0, 1]), array([3]), array([2])]
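To go from these groups to the index combinations asked for in the question, the groups can be passed to itertools.product. The following is only a minimal sketch, assuming numpy >= 1.15 (for kind="stable" and return_counts); the function name index_combinations is made up for illustration. Note that np.arange(len(a))[np.argsort(idx)] is the same array as np.argsort(idx), so the sketch uses the shorter form.

```python
from itertools import product

import numpy as np


def index_combinations(a):
    """Return every tuple of indices that picks one occurrence of each unique value.

    Hypothetical helper, not part of numpy.
    """
    a = np.asarray(a)
    _, idx, cnt = np.unique(a, return_inverse=True, return_counts=True)
    # A stable sort keeps the indices inside each group in ascending order.
    order = np.argsort(idx, kind="stable")
    # Cut the sorted indices at the group boundaries, one group per unique value.
    groups = np.split(order, np.cumsum(cnt)[:-1])
    # One index from each group, in every possible combination.
    return [tuple(int(i) for i in combo) for combo in product(*groups)]


combos = index_combinations([1, 1, 3, 2])
print(combos)  # [(0, 3, 2), (1, 3, 2)]
```

The number of combinations is the product of the counts, so this can grow quickly when many values repeat.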
Upvotes: 1
Reputation: 31050
This does get you the indices, but possibly not in the format you want:
[np.where(a==x) for x in np.unique(a)]
[(array([0, 1]),), (array([3]),), (array([2]),)]
I imagine there is a better way, without the for loop.
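If plain index arrays are preferable to the one-element tuples that np.where returns, np.flatnonzero is one option. This is just a small sketch of the same idea; it still loops over the unique values:

```python
import numpy as np

a = np.array([1, 1, 3, 2])
# np.flatnonzero returns the index array directly instead of a 1-tuple.
groups = [np.flatnonzero(a == x) for x in np.unique(a)]
print(groups)
```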
Upvotes: 3