Andy Wang

Reputation: 35

Can't check if any item inside an array is also in another dataframe

I've created a dataframe c and an array b. I can successfully check whether each number in the first column of c is also in b.

import numpy as np
import pandas as pd

a = np.array([
                [2.,1.,1.],
                [3.,4.,1.],
                [5.,6.,1.],
                [7.,8.,1.]])

c = pd.DataFrame(data=a,
                 dtype = 'float64')


b = np.array(([1, 10, 5, 2]), dtype = 'float64')

for i in range(len(c)):
    if c.iloc[i,0] in b:
        print ("Great")
    else:
        print ('sad')

output:

Great
sad
Great
sad

However, the following doesn't work when I check whether each item in b is in the dataframe c. Why is that?

for i in range(len(b)):
    if b[i,0] in c:
        print ('hola')
    else:
        print ('False')

Upvotes: 1

Views: 61

Answers (1)

jezrael

Reputation: 862521

I think it is better to avoid loops, because they are slow. To check one column against the array, use Series.isin:

mask1 = c[0].isin(b)
print (mask1)
0     True
1    False
2     True
3    False
Name: 0, dtype: bool

d1 = np.where(mask1, 'Great', 'sad')
print (d1)
['Great' 'sad' 'Great' 'sad']

To check all columns, use DataFrame.isin, then any(axis=1) to test whether each row of the resulting boolean DataFrame contains at least one True:

mask2 = c.isin(b).any(axis=1)
print (mask2)
0    True
1    True
2    True
3    True
dtype: bool

e1 = np.where(mask2, 'hola', 'False')
print (e1)
['hola' 'hola' 'hola' 'hola']

Detail:

print (c.isin(b))
       0      1     2
0   True   True  True
1  False  False  True
2   True  False  True
3  False  False  True

To check the other direction, whether each value of b occurs anywhere in c, use numpy.in1d after flattening the DataFrame's values to a 1-D array with numpy.ravel:

mask3 = np.in1d(b, c.values.ravel())
print (mask3)
[ True False  True  True]
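
As for why the loop in the question fails, here is a minimal sketch using the same c and b. It illustrates two things: b is one-dimensional, so b[i,0] raises an IndexError, and the in operator on a DataFrame tests the column labels rather than the cell values. The variable names label_hit, label_miss, index_error_raised, values and labels are only illustrative. The sketch uses np.isin instead of np.in1d, since recent NumPy releases recommend it, and to_numpy() in place of .values; both substitutions are my choice and should give the same mask as above.

```python
import numpy as np
import pandas as pd

c = pd.DataFrame(np.array([[2., 1., 1.],
                           [3., 4., 1.],
                           [5., 6., 1.],
                           [7., 8., 1.]]))
b = np.array([1, 10, 5, 2], dtype='float64')

# `in` on a DataFrame tests the column labels (here 0, 1, 2), not the values.
label_hit = 2 in c        # True, but only because 2 is a column label
label_miss = 5.0 in c     # False, although 5.0 is a value in column 0

# b has a single dimension, so two indices are one too many.
try:
    b[0, 0]
    index_error_raised = False
except IndexError:
    index_error_raised = True

# Element-wise check of b against every value stored in c.
values = c.to_numpy().ravel()
mask = np.isin(b, values)
labels = np.where(mask, 'hola', 'False')
print(labels)
```

So the loop would need b[i] instead of b[i,0], and a comparison against c.values (or the vectorized mask above) instead of against c itself.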

Upvotes: 2
