Reputation: 1781
I have two integer numpy arrays, let's say, arr1
and arr2
, that are permutations of range(some_length)
I want to get the third one, where
arr3[idx] = arr1.get_index_of(arr2[idx])
for all idx = 0,1,2,..., some_length-1
here get_index_of method
is a pseudo-method of getting index of some element in the collection.
That can be done with naive looping through all the indeces, searching correspondnent element with subsequent assignment of it's index, etc.
But that is slow -- O(n^2). Can it be done faster (At least n*log(n) complexity)? Can it be done via pretty numpy methods? Maybe some sorting with non-trivial key=
parameter? Sure there is some elegant solution.
Thank you in advance.
Upvotes: 1
Views: 57
Reputation: 231550
Lets try a test case
In [166]: arr1=np.random.permutation(10)
In [167]: arr2=np.random.permutation(10)
In [168]: arr1
Out[168]: array([4, 3, 2, 9, 7, 8, 5, 1, 6, 0])
In [169]: arr2
Out[169]: array([9, 2, 6, 4, 0, 3, 1, 7, 8, 5])
np.where(arr1==i)
performs your get_index_of method
, so your iterative solution is:
In [170]: np.array([np.where(arr1==i)[0] for i in arr2]).flatten()
Out[170]: array([3, 2, 8, 0, 9, 1, 7, 4, 5, 6], dtype=int32)
A vectorized approach is to do an 'outter' comparison between the 2 arrays. This produces a (10,10)
array, to which we can apply where
to get the indices. Still an O(n^2) method, but it is mostly compiled. On this size of a problem it is 5x faster.
In [171]: np.where(arr1==arr2[:,None])
Out[171]:
(array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=int32),
array([3, 2, 8, 0, 9, 1, 7, 4, 5, 6], dtype=int32))
Upvotes: 1
Reputation: 77991
say, a
is a permutation of 0..9
:
>>> a = np.random.permutation(10)
>>> a
array([3, 7, 1, 8, 2, 4, 6, 0, 9, 5])
then, the indexer array is:
>>> i = np.empty(len(a), dtype='i8')
>>> i[a] = np.arange(len(a))
>>> i
array([7, 2, 4, 0, 5, 9, 6, 1, 3, 8])
this means that, index of say 0
in a
is i[0] == 7
, which is true since a[7] == 0
.
So, in your example, say if you have an extra vector b
, you can do as in below:
>>> b
array([5, 9, 4, 8, 6, 1, 7, 2, 3, 0])
>>> i[b]
array([9, 8, 5, 3, 6, 2, 1, 4, 0, 7])
which means that, say, b[0] == 5
and index of 5
in a
is i[b][0] == 9
, which is true, since a[9] = 5 = b[0]
.
Upvotes: 3