Filter pandas group with if else condition

Question

I have a pandas dataframe like this:

ID	Tier1	Tier2
1111	RF	B
1111	OK	B
2222	RF	B
2222	RF	E
3333	OK	B
3333	LO	B

I need to cut down the table so the IDs are unique, but do so with the following hierarchy: RF>OK>LO for Tier1. Then B>E for Tier2.

So the expected output will be:

ID	Tier1	Tier2
1111	RF	B
2222	RF	B
2222	RF	E
3333	OK	B

then:

ID	Tier1	Tier2
1111	RF	B
2222	RF	B
3333	OK	B

I am struggling to figure out how to this. My initial attempt is to group the table with grouped = df.groupby('ID') and then:

grouped = df.groupby('ID')
for key, group in grouped:
    check_rf = group['Tier1']=='RF'
    check_ok = group['Tier1']=='OK'
    if check_rf.any():
        group = group[group['Tier1']=='RF']
    elif check_ok.any():
        #and so on

I think this is working to filter each group, but I have no idea how the groups can then relate back to the parent table (df). And I am sure there is a better way to do this.

Thanks!

J&#233;r&#244;me Richard · Accepted Answer

You can use two groupby+agg Pandas calls. Since the ordering RF>OK>LO and B>E are respectively compliant the (reverse) lexicographic ordering, you can use the trivial min/max functions for the aggregation (otherwise you can write your own custom min-max functions).
Here is how to do that (using a 2-pass filtering):

tmp = df.groupby(['ID', 'Tier2']).agg(max).reset_index()      # Step 1
output = tmp.groupby(['ID', 'Tier1']).agg(min).reset_index()  # Step 2

Here is the result in output:

     ID Tier1 Tier2
0  1111    RF     B
1  2222    RF     B
2  3333    OK     B

Filter pandas group with if else condition

Answers (2)

Related Questions