Reputation: 43
I'm writing some code in Python and have run into a problem. I have two arrays, A and B, both containing IDs. A has all the IDs, and B has the IDs belonging to a group. I'm trying to get the positions of the elements of B in A using this code:
>>> print B
[11600813 11600877 11600941 ..., 13432165 13432229 13434277]
>>> mask=np.nonzero(np.in1d(A, B))
>>> print A[mask]
[12966245 12993389 12665837 ..., 13091877 12965029 13091813]
But this is clearly wrong, since I'm not recovering the values of B. To check whether I was using numpy.in1d() correctly, I tried:
>>> mask=np.nonzero(np.in1d(A, B[0]))
>>> print A[mask]
[11600813]
which is right, so I'm guessing there is a problem with 'B' in numpy.in1d(). I tried using the boolean np.in1d(A, B) directly instead of converting it to indices, but it didn't work. I also tried B = numpy.array(B) and B = list(B), and neither worked.
But if I do B = numpy.array(B)[0] or B = list(B)[0], it still works for that element. Unfortunately, I can't use a 'for' loop over each element, because len(A) is 16777216 and len(B) is 9166, so it would take a lot of time.
I also made sure that all elements of B are in A:
>>> np.intersect1d(A, B)
[11600813 11600877 11600941 ..., 13432165 13432229 13434277]
Upvotes: 1
Views: 1647
Reputation: 97331
You can use numpy.argsort and numpy.searchsorted to get the positions:
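import numpy as np
# Example data: A holds unique IDs, B is a sample drawn from A
A = np.unique(np.random.randint(0, 100, 100))
B = np.random.choice(A, 10)
# Sort A, remembering where each sorted element sits in the original array
idxA = np.argsort(A)
sortedA = A[idxA]
# Locate each element of B in the sorted array, then map back to positions in A
idxB = np.searchsorted(sortedA, B)
pos = idxA[idxB]
# Both lines should print the same values, in the order of B
print(A[pos])
print(B)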
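To make the difference from the in1d approach concrete, here is a minimal sketch on small made-up arrays (hypothetical values, not the data from the question). A boolean membership mask selects elements in A's order, which is why the output in the question looks shuffled relative to B, while the argsort/searchsorted combination returns one position per element of B, in B's order. The sketch assumes every element of B occurs in A, and it uses np.isin in place of np.in1d, which gives the same result for 1-D input and is the name recommended in recent NumPy releases.

```python
import numpy as np

# Hypothetical small arrays standing in for the real ID arrays.
A = np.array([50, 10, 40, 20, 30])
B = np.array([20, 50, 10])

# A membership mask is indexed over A, so matches come back in A's order.
in_A_order = A[np.isin(A, B)]

# argsort + searchsorted gives one position in A per element of B.
idxA = np.argsort(A)                  # permutation that sorts A
sortedA = A[idxA]
idxB = np.searchsorted(sortedA, B)    # positions of B's values in sorted A
pos = idxA[idxB]                      # mapped back to positions in original A

print(in_A_order)   # same values as B, but ordered as they appear in A
print(A[pos])       # same values as B, in B's order
```

If some IDs in B might be absent from A, searchsorted still returns an insertion point for them, so a check such as comparing A[pos] against B is worth adding.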
If you want a faster method, consider using pandas:
import pandas as pd
# Build an index over A (its values must be unique), then look up each element of B
s = pd.Index(A)
pos = s.get_indexer(B)
print(A[pos])
print(B)
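One detail worth knowing about the pandas route: to my understanding, get_indexer reports a value it cannot find as -1 rather than raising, so indexing A with the raw result would silently pick the last element for a missing ID. The sketch below uses small made-up arrays (hypothetical values, with 99 deliberately absent from A) and filters those entries out before indexing.

```python
import numpy as np
import pandas as pd

# Hypothetical small arrays; 99 is deliberately not present in A.
A = np.array([50, 10, 40, 20, 30])
B = np.array([20, 99, 10])

pos = pd.Index(A).get_indexer(B)   # -1 marks values of B not found in A
found = pos >= 0                   # boolean mask of the IDs that were located
matched_pos = pos[found]           # positions in A for the located IDs only

print(pos)
print(A[matched_pos])              # values of B that exist in A, in B's order
print(B[~found])                   # values of B that are missing from A
```

In the question all elements of B are known to be in A, so the filter would be a no-op there, but it is a cheap safeguard.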
Upvotes: 2