Python return the first index where multiple columns contain a desired value

Question

I have the following sample data frame:

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'seq': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
    'flag1': [np.nan, np.nan, 1, 1, 1, 0, -1, -1, 1, 1, 1, 0],
    'flag2': [np.nan, np.nan, np.nan, 0, 0, 0, -1, -1, 0, 1, 0, 1]
})

I am trying to get the index of the first row where both flag1 and flag2 values are 1. In the above case, it would be 9.

I tried df[df.flag1 == 1.0 & df.flag2 == 1.0].index[0], but it raises an error. Similarly, df[df.flag1 == 1.0] & df[df.flag2 == 1.0].index[0] does not work either. I searched SO but could not find a solution for my specific need.

Dave Costa · Accepted Answer

In this expression:

df.flag1 == 1.0 & df.flag2 == 1.0

the & operator has higher precedence than ==, so the expression is actually interpreted as:

df.flag1 == (1.0 & df.flag2) == 1.0

which is not at all what you meant, and in this case produces an error.
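
The same precedence rule can be seen with plain integers, independent of pandas. This is only a small illustrative sketch; the variable names are made up for the example:

```python
# & binds tighter than ==, so the first line is parsed as
# 1 == (1 & 2) == 2, i.e. the chained comparison 1 == 0 == 2.
unparenthesized = 1 == 1 & 2 == 2

# With explicit parentheses, each comparison is evaluated first
# and the two boolean results are then combined with &.
parenthesized = (1 == 1) & (2 == 2)

print(unparenthesized)  # False
print(parenthesized)    # True
```

With integers the mistake silently gives a different answer; with a pandas Series it tends to surface as an exception instead.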

Add parentheses to force the evaluation order you want:

(df.flag1 == 1.0) & (df.flag2 == 1.0)

With this change, your initial approach should work fine.
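
For completeness, here is a minimal end-to-end sketch using the sample data from the question. The names mask, first_idx, and first_idx_alt are introduced here for illustration only, and the idxmax variant is an alternative that is not part of the original answer:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'seq': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
    'flag1': [np.nan, np.nan, 1, 1, 1, 0, -1, -1, 1, 1, 1, 0],
    'flag2': [np.nan, np.nan, np.nan, 0, 0, 0, -1, -1, 0, 1, 0, 1],
})

# Parenthesize each comparison so that & combines two boolean Series.
# NaN == 1.0 evaluates to False, so the NaN rows are simply excluded.
mask = (df.flag1 == 1.0) & (df.flag2 == 1.0)

# The question's original approach, now with the corrected mask.
first_idx = df[mask].index[0]
print(first_idx)  # 9

# Alternative: idxmax returns the label of the first True in a boolean
# Series. Guard with mask.any(), because on an all-False mask idxmax
# would still return the first label.
first_idx_alt = mask.idxmax() if mask.any() else None
```

Note that df[mask].index[0] raises an IndexError when no row matches, so a check such as mask.any() is worth adding if that case can occur.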
