Reputation: 10033
I have this df:
df = pd.DataFrame({
'Team': [
'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'X', 'Y', 'Z'
],
'Ranking': [
2, 6, 6, 1, 8, 9, 16, 6, 16, 8, 6, 3, 1, 16, 9, 1, 2, 1, 16, 16, 16, 9, 9, 8, 8
],
'Points': [
1, 1, 1, 1, 1, 2, 1, 1, 1, 2, 1, 1, 1, 1, 3, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1
]
})
And I need to apply a filter to it using the following logic:
How can I apply the logic above to the dataframe?
Desired Result:
Teams /
A
B
C
D
E
F
G
H
I
J
K
L
M
N
O
P
Q
R
Y
Z
Upvotes: 0
Views: 55
Reputation: 1501
I've tried a solution, but the answer was quite different from the one you've written, probably due to a mistyping. Anyway, here it is a way of doing what you want.
To improve readability, I made a function to return the numbers of items to keep depending on their rank.
def items2keep(ranking):
return 4 if ranking < 5 else 3 if ranking < 13 else 2 if ranking < 17 else 1
result = []
for rank in df['Ranking'].unique():
_df = df[df['Ranking'] == rank].nlargest(
n=items2keep(rank),
columns=['Points']
)
result.append(_df)
final_df = pd.concat(result)
If we print final_df
we get:
print(final_df.sort_values(['Ranking', 'Points'], ascending=[True, False]))
Team Ranking Points
15 P 1 3
3 D 1 1
12 M 1 1
17 R 1 1
0 A 2 1
16 Q 2 1
11 L 3 1
1 B 6 1
2 C 6 1
7 H 6 1
9 J 8 2
4 E 8 1
23 Y 8 1
14 O 9 3
5 F 9 2
21 V 9 1
6 G 16 1
8 I 16 1
Upvotes: 0
Reputation: 150745
Let's use pd.cut
to map the Rankings to numbers of rows to extract, then compare those to the relative row numbers with groupby().cumcount()
:
thresh = pd.cut(df['Ranking'], bins=[0,4,12,16,20],
labels=[4,3,2,1]).astype(int)
df.loc[df.sort_values(['Points'])
.groupby('Ranking').cumcount().lt( thresh), 'Team']
Output:
0 A
1 B
2 C
3 D
4 E
5 F
7 H
11 L
12 M
15 P
16 Q
17 R
19 T
20 U
21 V
22 X
23 Y
24 Z
Name: Team, dtype: object
Upvotes: 1