Dylan B

Reputation: 173

Detect ordered pair in numpy array

I am using a numpy array to hold a list of ordered pairs (representing grid coordinates). The algorithm I am writing needs to check if a newly generated ordered pair is already in this array. Below is a schematic of the code:

cluster=np.array([[x1,y1]])
cluster=np.append(cluster,[[x2,y2]],axis=0)
cluster=np.append...etc.

new_spin=np.array([[x,y]])

if not (new_spin in cluster):
    do something

The problem with this code is that it gives false positives. If x or y appears in cluster, then new_spin in cluster evaluates as True. At first I thought a simple fix would be to ask whether x and y both appear in cluster, but this would not ensure that they appear together as an ordered pair. To check that, I'd have to find the indices where x and y appear in cluster and compare them, which seems clunky and inelegant. I'm certain there must be a better solution, but I have not been able to work it out myself.

Thanks for any help.

Upvotes: 1

Views: 1968

Answers (1)

unutbu

Reputation: 880409

Let's work through an example:

In [7]: import numpy as np
In [8]: cluster = np.random.randint(10, size = (5,2))
In [9]: cluster
Out[9]: 
array([[9, 7],
       [7, 2],
       [8, 9],
       [1, 3],
       [3, 4]])

In [10]: new_spin = np.array([[1,2]])

In [11]: new_spin == cluster
Out[11]: 
array([[False, False],
       [False,  True],
       [False, False],
       [ True, False],
       [False, False]], dtype=bool)

new_spin == cluster is a numpy array of dtype bool. It is True where the value in cluster equals the corresponding value in new_spin.
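To see why the in operator gives false positives here, the following sketch compares it with the elementwise test. It assumes, as current NumPy behaves, that in on an array amounts to (new_spin == cluster).any(); the variable names elementwise_hit and in_result are only illustrative.

```python
import numpy as np

# Same data as the example session above.
cluster = np.array([[9, 7], [7, 2], [8, 9], [1, 3], [3, 4]])
new_spin = np.array([[1, 2]])

# True if ANY single element matches, regardless of which row it is in.
elementwise_hit = bool((new_spin == cluster).any())

# The `in` operator on an array gives the same answer.
in_result = new_spin in cluster

print(elementwise_hit, in_result)  # True True
```

Both values are True even though the pair [1, 2] is not a row of cluster: the 1 matches in one row and the 2 in another.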

For new_spin to be "in" cluster, at least one row of the above boolean array must be all True. We can test each row by calling the all(axis = 1) method:

In [12]: (new_spin == cluster).all(axis = 1)
Out[12]: array([False, False, False, False, False], dtype=bool)

So new_spin is "in" cluster if any of the rows is all True:

In [14]: (new_spin == cluster).all(axis = 1).any()
Out[14]: False
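The check above could be wrapped in a small helper. This is only a sketch: the function name contains_pair is made up for illustration, and it assumes cluster is a 2-D array with one pair per row.

```python
import numpy as np

def contains_pair(cluster, pair):
    """Return True if `pair` matches some row of `cluster` exactly.

    Hypothetical helper; assumes `cluster` has shape (n, 2).
    """
    cluster = np.asarray(cluster)
    pair = np.asarray(pair).reshape(1, -1)  # accept [x, y] or [[x, y]]
    # A row matches only if every element in that row is equal.
    return bool((cluster == pair).all(axis=1).any())

cluster = np.array([[9, 7], [7, 2], [8, 9], [1, 3], [3, 4]])
print(contains_pair(cluster, [1, 2]))    # False
print(contains_pair(cluster, [[1, 3]]))  # True
```

With this, the test in the question would read: if not contains_pair(cluster, new_spin): do something.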

By the way, np.append is a slow operation: each call allocates a new array and copies all of the existing data into it, so it is much slower than Python's list.append. Chances are you will get much better performance if you avoid np.append. If cluster is not too large, you may be better off making cluster a Python list of lists, at least until you are done appending items. Then, if needed, convert cluster to a numpy array with cluster = np.array(cluster).
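A minimal sketch of that list-based approach follows. The generated coordinates are made-up stand-ins for whatever the real algorithm produces; note that membership tests on a list of lists already compare whole pairs, so no false positives arise.

```python
import numpy as np

# Hypothetical coordinates standing in for the generated pairs.
generated = [(9, 7), (7, 2), (8, 9), (1, 3), (3, 4), (1, 3)]

cluster = []                       # plain Python list while growing
for x, y in generated:
    if [x, y] not in cluster:      # compares the whole pair, not single values
        cluster.append([x, y])     # cheap, unlike np.append

cluster = np.array(cluster)        # convert once, after appending is done
print(cluster.shape)  # (5, 2)
```

The repeated pair (1, 3) is skipped, so the final array has five rows.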

Upvotes: 4
