newbie
newbie

Reputation: 644

Detect odd and even rows Python

I need to loop over a dataframe and check if the odd rows of a specific column are equal to a given variable( same for even rows).

This is my code:

mydf = pd.read_excel(test.xlsx, header=0, index= False)
mydf = mydf.sort_values(by='Time') 
if ((mydf['Door Name'].iloc[::2]=='RDC_OUT-1') & (mydf['Door Name'].iloc[1::2]=='RDC_IN-1')):
    for i in range (l):
        mydf['diff'] = mydf['Times'].iloc[1::2].to_numpy() - mydf['Times'].iloc[::2]
        Total = mydf['diff'].sum()
        print('Total: ',Total)

but when I run it I get this error:

if ((mydf['Door Name'].iloc[::2]=='RDC_OUT-1') & (mydf['Door Name'].iloc[1::2]=='RDC_IN-1')):

  File "C:\Users\khoul\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\generic.py", line 1478, in __nonzero__
    .format(self.__class__.__name__))

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

this is my dataframe:

enter image description here

 Door name    Time                   Last Name   First Name
 RDC_IN-1     05/08/2019  15:23:00   aa          bb
 RDC_OUT-1    05/08/2019  12:39:00   aa          bb           
 RDC_IN-1     05/08/2019  12:13:00   aa          bb
 RDC_OUT-1    05/08/2019  09:10:00   aa          bb

I don't know why it wouldn't accept it!

Upvotes: 1

Views: 2254

Answers (1)

jezrael
jezrael

Reputation: 863146

My idea is first select all sorted rows by Time with RDC_OUT-1 and next RDC_IN-1 values, another rows are filter out:

mydf = df.sort_values(by='Time') 

m1 = mydf['Door name'] == 'RDC_OUT-1'
m2 = mydf['Door name'] == 'RDC_IN-1'

m11 = m1 & m2.shift(-1)
m22 = m1.shift() & m2

df = mydf[m11 | m22]
print (mydf)
   Door name                Time Last Name First Name
3  RDC_OUT-1 2019-05-08 09:10:00        aa         bb
2   RDC_IN-1 2019-05-08 12:13:00        aa         bb
1  RDC_OUT-1 2019-05-08 12:39:00        aa         bb
0   RDC_IN-1 2019-05-08 15:23:00        aa         bb

So because get same number of IN and OUT rows solution from comments should working nice:

Total = (df.loc[df['Door name']=='RDC_IN-1','Time'] - 
         df.loc[df['Door name']=='RDC_OUT-1','Time'].to_numpy()).sum()
print('Total: ',Total)
Total:  0 days 05:47:00

Upvotes: 1

Related Questions