Reputation: 67

Cross-reference between numpy arrays

I have a 1d array of ids, for example:

a = [1, 3, 4, 7, 9]

Then another 2d array:

b = [[1, 4, 7, 9], [3, 7, 9, 1]]

I would like to have a third array with the same shape as b, where each item is the index in a of the corresponding item in b, that is:

c = [[0, 2, 3, 4], [1, 3, 4, 0]]

What's a vectorized way to do that using numpy?

Upvotes: 6

Answers (3)

waykiki

Reputation: 1094

Effectively, this solution is a one-liner. The only catch is that you need to reshape the array before you do the one-liner, and then reshape it back again:

import numpy as np

a = np.array([1, 3, 4, 7, 9])
b = np.array([[1, 4, 7, 9], [3, 7, 9, 1]])
original_shape = b.shape

c = np.where(b.reshape(b.size, 1) == a)[1]

c = c.reshape(original_shape)

This results in:

[[0 2 3 4]
 [1 3 4 0]]
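As a possible variant of the same comparison, here is a minimal sketch that skips the reshape by adding a trailing axis to b and taking argmax over it. The names matches and all_found are only illustrative, and the sketch assumes every value of b occurs in a. Like the version above, it builds a temporary boolean array of b.size * len(a) elements.

```python
import numpy as np

a = np.array([1, 3, 4, 7, 9])
b = np.array([[1, 4, 7, 9], [3, 7, 9, 1]])

# Compare every element of b with every element of a: shape (2, 4, 5).
matches = b[..., None] == a

# argmax gives the position of the first True along the last axis.
c = matches.argmax(axis=-1)

# argmax also returns 0 where nothing matched, so check coverage separately.
all_found = matches.any(axis=-1).all()

print(c)
print(all_found)
```

If all_found is False, some entries of c are placeholders rather than real indices.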

Upvotes: 1

Ahmed AEK

Reputation: 17616

This may not make sense at first, but you can use np.interp to do that:

import numpy as np

a = [1, 3, 4, 7, 9]
sorting = np.argsort(a)  # np.interp needs increasing x-coordinates
positions = np.arange(0, len(a))
xp = np.array(a)[sorting]
fp = positions[sorting]
b = [[1, 4, 7, 9], [3, 7, 9, 1]]
# Round with rint before casting to int, because floats are tricky.
# A bare astype(int) may be faster for small len(a), but it truncates and is not recommended.
c = np.rint(np.interp(b, xp, fp)).astype(int)

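This should work as long as len(a) is smaller than the largest integer a float can represent exactly. np.interp works in float64, so that limit is 2**53; the figure 16,777,217 applies to float32. The algorithm costs O(n*log(n)) for the sort, and the lookup is roughly b.size*log(len(a)) to be precise. Note that a value of b that is not in a still gets an interpolated index rather than an error.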
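The same sorted-lookup idea can also be sketched with np.searchsorted, which stays in integers and so avoids the rounding question entirely. This is a swapped-in technique, not what the answer above uses; the function name lookup_indices is hypothetical, and the returned found mask is an added safeguard for values of b that are missing from a.

```python
import numpy as np

def lookup_indices(a, b):
    """Return (c, found): c[i, j] is the index in a of b[i, j] where found is True."""
    a = np.asarray(a)
    b = np.asarray(b)
    order = np.argsort(a)              # permutation that sorts a
    sorted_a = a[order]
    pos = np.searchsorted(sorted_a, b) # insertion points, same shape as b
    # Values larger than max(a) give pos == len(a); clip to keep indexing valid.
    pos = np.clip(pos, 0, len(a) - 1)
    found = sorted_a[pos] == b         # True only where b really occurs in a
    return order[pos], found

a = [1, 3, 4, 7, 9]
b = [[1, 4, 7, 9], [3, 7, 9, 1]]
c, found = lookup_indices(a, b)
print(c)
```

Because the indices are mapped back through order, a does not have to be sorted beforehand.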

Upvotes: 2

user17242583

Reputation:

Broadcasting to the rescue!

>>> ((np.arange(1, len(a) + 1)[:, None, None]) * (a[:, None, None] == b)).sum(axis=0) - 1
array([[0, 2, 3, 4],
       [1, 3, 4, 0]])

Upvotes: 0
