Apply filter to dataframe based on conditions

Question

I have this df:

import pandas as pd

df = pd.DataFrame({
  'Team': [
    'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'X', 'Y', 'Z'
  ],
  'Ranking': [
    2, 6, 6, 1, 8, 9, 16, 6, 16, 8, 6, 3, 1, 16, 9, 1, 2, 1, 16, 16, 16, 9, 9, 8, 8
  ],
  'Points': [
    1, 1, 1, 1, 1, 2, 1, 1, 1, 2, 1, 1, 1, 1, 3, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1
  ]
})

And I need to apply a filter to it using the following logic:

If the team is ranked 1-4, keep a maximum of 4 items.
If the team is ranked 5-12, keep a maximum of 3 items.
If the team is ranked 13-16, keep a maximum of 2 items.
If the team is ranked 17-20, keep a maximum of 1 item.
When dropping the items that exceed the quota, drop the ones with the fewest points.

How can I apply the logic above to the dataframe?

Desired Result:

Team
A
B
C
D
E
F
G
H
I
J
K
L
M
N
O
P
Q
R
Y
Z

Quang Hoang · Accepted Answer

Let's use pd.cut to map the Rankings to the number of rows to keep, then compare those to the relative row numbers from groupby().cumcount():

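# Map each Ranking to its quota: 1-4 -> 4, 5-12 -> 3, 13-16 -> 2, 17-20 -> 1
thresh = pd.cut(df['Ranking'], bins=[0, 4, 12, 16, 20],
                labels=[4, 3, 2, 1]).astype(int)

# Sort by Points (ascending by default), number the rows within each Ranking,
# and keep the rows whose position is below the quota for that Ranking
df.loc[df.sort_values(['Points'])
         .groupby('Ranking').cumcount().lt(thresh), 'Team']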

Output:

0     A
1     B
2     C
3     D
4     E
5     F
7     H
11    L
12    M
15    P
16    Q
17    R
19    T
20    U
21    V
22    X
23    Y
24    Z
Name: Team, dtype: object
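
Note that sort_values(['Points']) sorts in ascending order, so within each Ranking the rows with the fewest points get the lowest cumcount and are the ones kept. In the output above, for example, team O (Ranking 9, 3 points) is missing while V and X (1 point each) are present. If the intent is to drop the rows with the fewest points, one possible variant is to sort in descending order. The sketch below is only an illustration under that assumption: the helper name keep_top_by_points is made up for this example, and kind='stable' is added so that ties keep their original row order.

```python
import pandas as pd

df = pd.DataFrame({
    'Team': list('ABCDEFGHIJKLMNOPQRSTUVXYZ'),
    'Ranking': [2, 6, 6, 1, 8, 9, 16, 6, 16, 8, 6, 3, 1,
                16, 9, 1, 2, 1, 16, 16, 16, 9, 9, 8, 8],
    'Points': [1, 1, 1, 1, 1, 2, 1, 1, 1, 2, 1, 1, 1,
               1, 3, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1],
})


def keep_top_by_points(frame):
    """Hypothetical helper: keep at most `quota` rows per Ranking, highest Points first."""
    # Same binning as in the answer: 1-4 -> 4, 5-12 -> 3, 13-16 -> 2, 17-20 -> 1
    quota = pd.cut(frame['Ranking'], bins=[0, 4, 12, 16, 20],
                   labels=[4, 3, 2, 1]).astype(int)
    # Highest Points first; the stable sort leaves tied rows in their original order
    position = (frame.sort_values('Points', ascending=False, kind='stable')
                     .groupby('Ranking').cumcount())
    # .lt() aligns both Series on the index before comparing
    return frame.loc[position.lt(quota), 'Team']


kept = keep_top_by_points(df)
print(kept.tolist())
```

With this variant the higher-scoring teams O and J are retained, and the rows dropped within a Ranking are the lowest-scoring ones (the latest rows in case of a tie).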
