numpy in1d returning incorrect results?

Question

I'm seeing odd behavior from numpy's in1d function. I have two arrays of integer particle IDs, A and B (IDs are unique to each particle). Array A contains all the particles, and array B contains the particles that belong to a group (every particle in B is also in A). What I want is the index in A of every grouped particle, but for some reason numpy's in1d does not seem to return the correct result. Here is an example:

A = all particle IDs (length of 54480)
B = all grouped particle IDs (length of 48061)

A brute force search shows that all particle IDs within B do reside in A. I can also do:

import numpy as np
matches = np.in1d(B, A)  # one boolean per element of B
print(len(np.where(matches)[0]))
>> 48061

to verify that all elements of B are present in A. Now the odd part is that if I do

matches = np.in1d(A, B)  # one boolean per element of A
print(len(np.where(matches)[0]))
>> 35590

I get something unexpected. Shouldn't this return 48061 True and 6419 False? I've uploaded A.txt and B.txt to my dropbox if anyone wants to mess with this dataset (~300K each). Thanks in advance for any help you can provide!

edit: I should also mention that I need the returned boolean array to stay in the original order of A, so numpy's intersect1d (which returns sorted values) is out of the question.
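One thing that may be worth checking is whether B contains repeated IDs. The two calls count different things: the first flags each element of B, the second flags each element of A, so if A is unique the second count can be no larger than the number of distinct values in B. The sketch below uses small made-up ID arrays, not the real A.txt and B.txt, and it uses np.isin, which behaves like np.in1d for 1-D input. Whether duplicates explain the 35590 here is an assumption that only a check on the actual data can confirm.

```python
import numpy as np

# Hypothetical toy data standing in for the real ID arrays.
A = np.array([10, 20, 30, 40, 50])   # all particle IDs, unique
B = np.array([20, 40, 20, 50, 40])   # grouped IDs, with repeats

in_A = np.isin(B, A)   # one flag per element of B: is it in A?
in_B = np.isin(A, B)   # one flag per element of A: is it in B?

count_B_in_A = int(in_A.sum())     # counts every entry of B, repeats included
count_A_in_B = int(in_B.sum())     # counts distinct IDs of A found in B
n_unique_B = len(np.unique(B))     # number of distinct IDs in B

# Indices into A of the grouped particles, in A's original order.
idx = np.nonzero(in_B)[0]

print(count_B_in_A, count_A_in_B, n_unique_B)
print(idx)
```

On the real arrays, comparing len(np.unique(B)) with len(B) would show whether B holds repeats. The idx array keeps the order of A, so nothing has to be sorted.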
