Nico Schlömer
Nico Schlömer

Reputation: 58791

given permuted arrays, find permutation

I have two numpy integer arrays,

import numpy

a = numpy.array([1, 3, 5, 0])
b = numpy.array([3, 5, 0, 1])

which I know are permutations of each other. How can I find the permutation, i.e., the integer array i such that

a[i] == b

? Explicit for loops with comparisons across the entire arrays works, but seems inefficient.

Bonus points if it works of permutation of row-arrays like

import numpy

a = numpy.array([
   [1, 2],
   [3, 7],
   [5, 12],
   [0, 4],
   # ...
])
b = numpy.array([
   [3, 7],
   [5, 12],
   [0, 4], 
   [1, 2],
   # ...
])

Upvotes: 1

Views: 192

Answers (2)

Divakar
Divakar

Reputation: 221584

Here's one with np.searchsorted on 1D-views for 2D arrays -

# https://stackoverflow.com/a/45313353/ @Divakar
def view1D(a, b): # a, b are arrays
    a = np.ascontiguousarray(a)
    b = np.ascontiguousarray(b)
    void_dt = np.dtype((np.void, a.dtype.itemsize * a.shape[1]))
    return a.view(void_dt).ravel(),  b.view(void_dt).ravel()

A,B = view1D(a,b)
sidx = A.argsort()
idx = sidx[np.searchsorted(A,B,sorter=sidx)]

For 1D input arrays, we can directly feed in a and b as A and B respectively.

Alternatively, for positive integers, we can use dimensionality-reduction with grid-mapping, keeping rest of it same as earlier -

s = np.r_[1,a[:,:-1].max(0)+1]
A,B = a.dot(s),b.dot(s)

Or use np.ravel_multi_index to do the mapping on 2D-grid -

shp = a.max(0)+1
A,B = np.ravel_multi_index(a.T,shp),np.ravel_multi_index(b.T,shp)

Another for positive integers with sparse-matrix using the same grid-mapping technique -

from scipy.sparse import csr_matrix

R = np.arange(1,len(a)+1)
c = csr_matrix((R, a.T), shape=a.max(0)+1)
idx = c[b[:,0],b[:,1]].A1-1

Upvotes: 3

Paul Panzer
Paul Panzer

Reputation: 53029

Here's one using argsort twice. It seems a few percent faster than @Divakar's:

enter image description here

from simple_benchmark import BenchmarkBuilder, MultiArgument
import numpy as np

B = BenchmarkBuilder()

@B.add_function()
def div(A,B):
    sidx = A.argsort()
    return sidx[np.searchsorted(A,B,sorter=sidx)]

@B.add_function()
def pp(A,B):
    oa,ob = (x.argsort() for x in (A,B))
    o = np.empty_like(oa)
    o[ob] = oa
    return o

@B.add_arguments('array size')
def argument_provider():
    for exp in range(8, 30):
        dim_size = int(1.4**exp)
        a = np.random.permutation(dim_size)
        b = np.random.permutation(dim_size)
        yield dim_size, MultiArgument([a,b])

r = B.run()
r.plot()

import pylab
pylab.savefig('bm.png')

Upvotes: 3

Related Questions