Reputation: 58791
I have two numpy integer arrays,
import numpy
a = numpy.array([1, 3, 5, 0])
b = numpy.array([3, 5, 0, 1])
which I know are permutations of each other. How can I find the permutation, i.e., the integer array i
such that
a[i] == b
? Explicit for
loops with comparisons across the entire arrays works, but seems inefficient.
Bonus points if it works of permutation of row-arrays like
import numpy
a = numpy.array([
[1, 2],
[3, 7],
[5, 12],
[0, 4],
# ...
])
b = numpy.array([
[3, 7],
[5, 12],
[0, 4],
[1, 2],
# ...
])
Upvotes: 1
Views: 192
Reputation: 221584
Here's one with np.searchsorted
on 1D
-views for 2D
arrays -
# https://stackoverflow.com/a/45313353/ @Divakar
def view1D(a, b): # a, b are arrays
a = np.ascontiguousarray(a)
b = np.ascontiguousarray(b)
void_dt = np.dtype((np.void, a.dtype.itemsize * a.shape[1]))
return a.view(void_dt).ravel(), b.view(void_dt).ravel()
A,B = view1D(a,b)
sidx = A.argsort()
idx = sidx[np.searchsorted(A,B,sorter=sidx)]
For 1D
input arrays, we can directly feed in a
and b
as A
and B
respectively.
Alternatively, for positive integers, we can use dimensionality-reduction with grid-mapping
, keeping rest of it same as earlier -
s = np.r_[1,a[:,:-1].max(0)+1]
A,B = a.dot(s),b.dot(s)
Or use np.ravel_multi_index
to do the mapping on 2D-grid -
shp = a.max(0)+1
A,B = np.ravel_multi_index(a.T,shp),np.ravel_multi_index(b.T,shp)
Another for positive integers with sparse-matrix
using the same grid-mapping
technique -
from scipy.sparse import csr_matrix
R = np.arange(1,len(a)+1)
c = csr_matrix((R, a.T), shape=a.max(0)+1)
idx = c[b[:,0],b[:,1]].A1-1
Upvotes: 3
Reputation: 53029
Here's one using argsort twice. It seems a few percent faster than @Divakar's:
from simple_benchmark import BenchmarkBuilder, MultiArgument
import numpy as np
B = BenchmarkBuilder()
@B.add_function()
def div(A,B):
sidx = A.argsort()
return sidx[np.searchsorted(A,B,sorter=sidx)]
@B.add_function()
def pp(A,B):
oa,ob = (x.argsort() for x in (A,B))
o = np.empty_like(oa)
o[ob] = oa
return o
@B.add_arguments('array size')
def argument_provider():
for exp in range(8, 30):
dim_size = int(1.4**exp)
a = np.random.permutation(dim_size)
b = np.random.permutation(dim_size)
yield dim_size, MultiArgument([a,b])
r = B.run()
r.plot()
import pylab
pylab.savefig('bm.png')
Upvotes: 3