Python: pandas dataframe comparison of rows with the same value in one column

Question

I have a dataframe which looks like:

id    name    num_1   num_2
1     A       12      14
1     A       15
2     B       10      9  
3     C       19      18
3     C       16

My desired output would be:

id    name    num_1   num_2
1     A       12      14
1     A       15

Basically, I want the rows that share an id where num_1 of the second row is greater than num_2 of the first row. The dataframe is sorted by id and num_1. Some ids have only one row, and those should be excluded from the final dataframe. I know I can iterate over the dataframe to get what I am looking for, but I am wondering if there is a better way of doing this. I have also tried to use shift to make it work, but it gives me incorrect results:

id    name    num_1   num_2
1     A       15
2     B       10      9  
3     C       19      18

Thanks

Mr Tarsa · Accepted Answer

Try using groupby with filter:

# keep a group only if it has 2+ rows and the 2nd row's num_1 > the 1st row's num_2
df.groupby('name').filter(
    lambda x: len(x) > 1 and x['num_1'].iloc[1] > x['num_2'].iloc[0])
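
For anyone who wants to try this end to end, here is a minimal reproducible sketch. It rebuilds the sample frame from the question, assuming the blank num_2 cells are NaN, and groups by id (as the question describes) rather than by name. The helper name second_exceeds_first is just for illustration. The second half shows one possible vectorized variant using a grouped shift. Note that it compares every row with the previous row in its group, so it only matches the filter version when each id has at most two rows.

```python
import numpy as np
import pandas as pd

# Sample data from the question; blank num_2 cells are assumed to be NaN.
df = pd.DataFrame({
    "id":    [1, 1, 2, 3, 3],
    "name":  ["A", "A", "B", "C", "C"],
    "num_1": [12, 15, 10, 19, 16],
    "num_2": [14, np.nan, 9, 18, np.nan],
})


def second_exceeds_first(group):
    # Keep a group only if it has at least two rows and the second row's
    # num_1 is greater than the first row's num_2.
    return bool(len(group) > 1
                and group["num_1"].iloc[1] > group["num_2"].iloc[0])


# Approach 1: groupby + filter, as in the answer (grouped by id here).
result = df.groupby("id").filter(second_exceeds_first)
print(result)

# Approach 2 (illustrative alternative): shift num_2 down by one row
# within each id, compare it with num_1, then keep every row of any id
# that has at least one matching row.
prev_num_2 = df.groupby("id")["num_2"].shift()
hit = df["num_1"] > prev_num_2            # NaN comparisons evaluate to False
keep = hit.groupby(df["id"]).transform("any")
result_shift = df[keep]
print(result_shift)
```

Both approaches keep only the two rows with id 1 for this sample. The filter version calls a Python function once per group, so the shift version may be faster on frames with many ids, but that is worth measuring on your own data.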
