Tasam Farkie
Tasam Farkie

Reputation: 329

How to vectorize this loop in Python loop requiring uniquesness and non-uniqueness check?

I have a numpy array (edges) of size 3xn and it contains exactly two rows where the first two elements of the row match and third element might or might not match. I am computing some stuff based on this criteria. My for loop based code goes along these lines

mat = np.zeros((edges.shape[0],2),dtype=np.int64)
counter = 0;
for i in range(edges.shape[0]):
    for j in range(i+1,edges.shape[0]):
        if  edges[i,0]==edges[j,0] and edges[i,1]==edges[j,1] and edges[i,2] != edges[j,2]:
            mat[2*counter,0] = i % nof
            mat[2*counter,1] = i // nof

            mat[2*counter+1,0] = j % nof
            mat[2*counter+1,1] = j // nof
            counter +=1
            break

where nof is a specific number. How can I speed-up this code using numpy? I cant use np.unique as this code requires uniqueness as well non-uniqueness check.

For example, given:

edges = np.array([
    [1,2,13],
    [4,5,15],
    [5,6,18],
    [1,2,12],
    [4,5,15],
    [5,6,18],
    ])

where the first two elements of each row can be found in another row (that is they are duplicated exactly twice) and nof=1, the above code gives the following result

[[0 0]
 [0 3]
 [0 0]
 [0 0]]

Upvotes: 1

Views: 66

Answers (1)

hpaulj
hpaulj

Reputation: 231605

I haven't absorbed how you are setting mat, but I suspect lexsorting on the first 2 columns might help:

In [512]: edges = np.array([
     ...:     [1,2,13],
     ...:     [4,5,15],
     ...:     [5,6,18],
     ...:     [1,2,12],
     ...:     [4,5,15],
     ...:     [5,6,18],
     ...:     ])
     ...:     
In [513]: np.lexsort((edges[:,1],edges[:,0]))
Out[513]: array([0, 3, 1, 4, 2, 5], dtype=int32)
In [514]: edges[_,:]   # sedges (below)
Out[514]: 
array([[ 1,  2, 13],
       [ 1,  2, 12],
       [ 4,  5, 15],
       [ 4,  5, 15],
       [ 5,  6, 18],
       [ 5,  6, 18]])

Now all matching rows are together.

If there always 2 matches, the pairs can be collected and reshaped into a 2 column array.

In [516]: sedges[:,2].reshape(-1,2)
Out[516]: 
array([[13, 12],
       [15, 15],
       [18, 18]])

Alternatively you could still iterate, but you don't have to check as far way.


argsort on a sorting list returns the reverse sort:

In [519]: np.lexsort((edges[:,1],edges[:,0]))
Out[519]: array([0, 3, 1, 4, 2, 5], dtype=int32)
In [520]: np.argsort(_)
Out[520]: array([0, 2, 4, 1, 3, 5], dtype=int32)
In [521]: sedges[_,:]
Out[521]: 
array([[ 1,  2, 13],
       [ 4,  5, 15],
       [ 5,  6, 18],
       [ 1,  2, 12],
       [ 4,  5, 15],
       [ 5,  6, 18]])

Upvotes: 2

Related Questions