brkcnbaz

Reputation: 29

How can I find indexes with same elements in 2d numpy array?

I'm working on a machine vision project. I shine a laser onto the scene and use OpenCV to detect the pixels in the picture that the laser light falls on. I keep these pixel coordinates in a 2D NumPy array. I want to make the x, y values unique by finding the pixels that share the same x value and taking their average. The pixel coordinates are stored sequentially in the array.

For example:

[[659 253]
 [660 253]
 [660 256]
 [661 253]
 [662 253]
 [663 253]
 [664 253]
 [665 253]]

First of all, my goal is to identify all rows whose first element is the same. With OpenCV, pixel values are kept in NumPy arrays because that is more convenient. I'm trying to write an indexing method myself, and I created a small NumPy array to keep things simple.

x = np.array([[1, 2], [1, 78], [1, 3], [1, 6], [4, 3], [5, 6], [5, 3]], np.int32)

I used the following approach to find the rows in x whose first element is the same.

for i in range(len(x)):
    if x[i] != x[-1] and x[i][0] == x[i + 1][0]:
        print(x[i], x[i + 1])

I want to walk through x and check whether the first element of each row also appears as the first element of the next row. To avoid an index out of range error, I used x[i] != x[-1]. I expected this loop to give me the result below.

[1,2] [1,78]
[1,78] [1,3]
[1,3] [1,6]
[5,6] [5,3]

I planned to remove the duplicate elements from the list afterwards, but instead I got:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

I am not familiar with numpy arrays so I could not get the solution I wanted. Is it possible to get the result I want using numpy array methods? Thanks for your time.
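For reference, the error most likely comes from `x[i] != x[-1]`: comparing two rows produces an element-wise boolean array, and `and` cannot reduce that to a single truth value. Below is a minimal sketch of the intended loop, assuming the goal is only to list adjacent rows that share a first element. The names `pairs`, `same_as_next`, `first_rows` and `second_rows` are illustrative and not from the original post. Stopping at `len(x) - 1` avoids the out-of-range index without comparing against the last row.

```python
import numpy as np

x = np.array([[1, 2], [1, 78], [1, 3], [1, 6], [4, 3], [5, 6], [5, 3]], np.int32)

# Loop version: stop one row early so that x[i + 1] always exists.
pairs = []
for i in range(len(x) - 1):
    # Compare scalars (first elements), not whole rows.
    if x[i, 0] == x[i + 1, 0]:
        pairs.append((x[i].tolist(), x[i + 1].tolist()))

# Vectorized version: compare every first element with its successor.
same_as_next = x[:-1, 0] == x[1:, 0]
first_rows = x[:-1][same_as_next]
second_rows = x[1:][same_as_next]

for a, b in pairs:
    print(a, b)
```

On the sample array this prints the four pairs listed in the question.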

Upvotes: 0

Views: 1042

Answers (3)

Mad Physicist

Reputation: 114578

You can use np.unique with its return_inverse argument, which is effectively a sorting index, and return_counts, which is going to help build the split points:

_, ind, cnt = np.unique(x[:, 0], return_inverse=True, return_counts=True)

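The inverse index ind maps every row of x to the position of its key among the sorted unique values, so it rebuilds x[:, 0] from them. To go the other way and gather the rows group by group, you need to invert the index, which np.argsort does: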

ind = np.argsort(ind)

To get the splitpoints of the data, you can use np.cumsum on the count. You don't need the last element because it is always going to mark the end of the array:

spp = np.cumsum(cnt[:-1])

Finally, you can use np.split to get the list of sub-arrays that you want:

result = np.split(x[ind, :], spp, axis=0)

TL;DR

_, ind, cnt = np.unique(x[:, 0], return_inverse=True, return_counts=True)
np.split(x[np.argsort(ind), :], np.cumsum(cnt[:-1]), axis=0)
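The question's end goal is to average the y values that share an x value, and that does not require splitting at all. Here is a minimal sketch built on the same np.unique call with return_inverse, assuming the first column is x and the second is y. The names `keys`, `inv`, `y_mean` and `averaged` are illustrative only.

```python
import numpy as np

x = np.array([[1, 2], [1, 78], [1, 3], [1, 6], [4, 3], [5, 6], [5, 3]], np.int32)

# keys: sorted unique x values, inv: group number of every row, cnt: group sizes.
keys, inv, cnt = np.unique(x[:, 0], return_inverse=True, return_counts=True)

# Sum the y values of each group, then divide by the group size.
y_mean = np.bincount(inv, weights=x[:, 1]) / cnt

# One row per unique x value: [x, mean of y]. The result is a float array.
averaged = np.column_stack((keys, y_mean))
print(averaged)
```

np.bincount with weights adds up the y values per group in one pass, so no Python-level loop over the groups is needed.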

Upvotes: 0

mathfux

Reputation: 5949

Approach 1

This is a numpy way to do this:

x_sorted = x[np.argsort(x[:,0])]
marker_idx = np.flatnonzero(np.diff(x_sorted[:,0]))+1
output = np.split(x_sorted, marker_idx)
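To make the intermediate values visible, here is the same approach run on the array from the question. This is a sketch, not a replacement: the only change is `kind="stable"`, added on the assumption that rows with equal keys should keep their original order, and `groups` is an illustrative name.

```python
import numpy as np

x = np.array([[1, 2], [1, 78], [1, 3], [1, 6], [4, 3], [5, 6], [5, 3]], np.int32)

# A stable sort keeps rows with equal keys in their original order.
x_sorted = x[np.argsort(x[:, 0], kind="stable")]

# np.diff is non-zero exactly where the key changes between neighbouring rows,
# so adding 1 gives the index of the first row of each new group.
marker_idx = np.flatnonzero(np.diff(x_sorted[:, 0])) + 1
groups = np.split(x_sorted, marker_idx)

print(marker_idx)  # [4 5]
for g in groups:
    print(g.tolist())
```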

Approach 2

You can also use the numpy_indexed package, which is designed to solve groupby problems with less code and without loss of performance:

import numpy_indexed as npi
npi.group_by(x[:, 0]).split(x)

Approach 3

You can also get the groups of indices from pandas, but this might not be the best option because it relies on a list comprehension:

import pandas as pd
[x[idx] for idx in pd.DataFrame(x).groupby([0]).indices.values()]

Output

[array([[  1,   2],
       [  1,  78],
       [  1,   3],
       [  1,   6],
       [  1, 234]]), 
array([[4, 3]]), 
array([[5, 6],
       [5, 3]])]

Upvotes: 2

IoaTzimas

Reputation: 10624

Try the following, using itertools.groupby:

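import itertools
# Sort whole rows by the first column; x.sort(axis=0) would sort each column separately.
x = x[np.argsort(x[:, 0], kind="stable")]
for _, group in itertools.groupby(x, key=lambda row: row[0]):
    print([tuple(p.tolist()) for p in group])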

Output:

[(1, 2), (1, 3), (1, 4), (1, 5), (1, 6)]
[(3, 6), (3, 78)]
[(5, 234)]

Upvotes: 0
