Reputation: 589
I have an array of shape (N, 2). Below is an example of the 2D array I have at hand:
[[0,2]
[0,3]
[1,2]
[1,3]
[1,4]]
I would like to get all the values in the second column that occur more than once. In the example above, I would like the values 2 and 3 returned.
Is there a specific np function for this sort of task?
It seems like it's the opposite of np.unique but I have yet to find a working function for this problem.
Upvotes: 3
Views: 1670
Reputation: 88275
You could index on the second column and use np.bincount to find the values with counts higher than 1:
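import numpy as np

a = np.array([[0, 2],
              [0, 3],
              [1, 2],
              [1, 3],
              [1, 4]])
# bincount counts occurrences per value; flatnonzero returns the values seen more than once
np.flatnonzero(np.bincount(a[:, 1]) > 1)
# array([2, 3], dtype=int64)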
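To see why this works, it may help to look at the intermediate result. The sketch below uses the same example array; the variable names `counts` and `dups` are only for illustration. `np.bincount` returns an array whose position `i` holds the number of times the value `i` appears, so the positions with a count above 1 are the duplicated values themselves.

```python
import numpy as np

a = np.array([[0, 2],
              [0, 3],
              [1, 2],
              [1, 3],
              [1, 4]])

# counts[i] is how many times the value i occurs in the second column
counts = np.bincount(a[:, 1])
print(counts)  # [0 0 2 2 1]

# Positions where the count exceeds 1 are the duplicated values
dups = np.flatnonzero(counts > 1)
print(dups)  # [2 3]
```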
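Or for large integers, np.unique will probably be a better option, since np.bincount allocates one counter per value up to the maximum and does not accept negative values: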
u, c = np.unique(a[:,1], return_counts=True)
u[c>1]
# array([2, 3])
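As a rough sketch, the np.unique approach can be wrapped in a small helper. The function name `duplicated_values` and its `col` parameter are hypothetical, introduced here only for illustration. The second call uses made-up data to show that very large or negative values are handled without building a huge count array.

```python
import numpy as np

def duplicated_values(arr, col=1):
    """Return the sorted values that occur more than once in column `col`."""
    values, counts = np.unique(np.asarray(arr)[:, col], return_counts=True)
    return values[counts > 1]

a = np.array([[0, 2],
              [0, 3],
              [1, 2],
              [1, 3],
              [1, 4]])
result = duplicated_values(a)
print(result)  # [2 3]

# Illustrative data: values far from zero, including a negative one
big = np.array([[0, 10**12],
                [1, 10**12],
                [2, -5]], dtype=np.int64)
big_result = duplicated_values(big)
print(big_result)  # [1000000000000]
```

The result comes back sorted because np.unique sorts its output.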
Upvotes: 3
Reputation: 2704
You can use Counter from collections to perform this task.
import numpy as np

z = np.array([[0, 2],
              [0, 3],
              [1, 2],
              [1, 3],
              [1, 4]])
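Now you can count the values in the desired column and keep those that occur more than once.
from collections import Counter

# tolist() converts the NumPy integers to plain Python ints before counting
dup = [item for item, count in Counter(z[:, 1].tolist()).items() if count > 1]
print(dup)
# [2, 3]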
Upvotes: 1
Reputation: 1545
You probably need something like:
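arr = [[0, 2],
       [0, 3],
       [1, 2],
       [1, 3],
       [1, 4]]
from collections import defaultdict

d = defaultdict(int)
for item in arr:
    d[item[1]] += 1  # count occurrences of each second-column value

for k, v in d.items():
    if v > 1:  # printed only when the value appears more than once
        print(k)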
Upvotes: 1