Reputation: 117
I have a pandas dataframe as follows:
import pandas as pd
df = pd.DataFrame()
df['Name'] = ['Abby', 'Abby', 'Abby', 'Abby', 'Abby', 'Daniel', 'Daniel', 'Daniel', 'Daniel', 'Daniel']
df['Marks'] = [100, 90, 76, 50, 10, 50, 45, 38, 25, 5]
I want to:
So, I have found the 40th percentile for each group using:
df.groupby('Name').quantile(0.4)
The aim is to get to:
My main issue is that the values for each group are not standardized and so I cannot apply an overall percentile value for the entire dataset.
However, all the help I have found on filtering a dataframe by a certain value does not do so separately for each group. I have seen the following questions:
Pandas, groupby where column value is greater than x
Pandas Groupby apply function to count values greater than zero
My question essentially builds on a variation of the following question: Calculate Arbitrary Percentile on Pandas GroupBy
Is there a way to do this in Pandas?
Upvotes: 5
Views: 3938
Reputation: 59274
Using your code for percentiles, together with loc, ge for >= (or gt for >), and index matching:
df = df.set_index('Name')
df.loc[df.Marks.ge(df.groupby('Name').quantile(0.4).Marks)]
Name Marks
0 Abby 100
1 Abby 90
2 Abby 76
5 Daniel 50
6 Daniel 45
7 Daniel 38
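If you would rather keep the default integer index, one possible variant is to look up each row's group threshold with map instead of relying on index alignment. This is only a sketch of the same idea; the names thresholds, row_threshold and result are mine, not from the question.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Abby"] * 5 + ["Daniel"] * 5,
    "Marks": [100, 90, 76, 50, 10, 50, 45, 38, 25, 5],
})

# One threshold per group: the 40th percentile of Marks within each Name.
thresholds = df.groupby("Name")["Marks"].quantile(0.4)

# Look up each row's group threshold by its Name (illustrative variable names),
# then keep the rows whose Marks are at or above that threshold.
row_threshold = df["Name"].map(thresholds)
result = df[df["Marks"].ge(row_threshold)]
print(result)
```

Swap ge for gt if rows exactly equal to the percentile should be excluded.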
Upvotes: 1
Reputation: 323226
You can use transform
df[df.Marks>df.groupby('Name').Marks.transform('quantile',0.4)]
Out[712]:
Name Marks
0 Abby 100
1 Abby 90
2 Abby 76
5 Daniel 50
6 Daniel 45
7 Daniel 38
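For reuse with other columns or percentiles, the transform approach can be wrapped in a small function. This is a rough sketch rather than a definitive implementation: filter_above_group_quantile and its parameters are hypothetical names, and a lambda is used here in place of the 'quantile' string shortcut.

```python
import pandas as pd

def filter_above_group_quantile(frame, group_col, value_col, q, inclusive=False):
    """Hypothetical helper: keep rows whose value is above their own group's q-quantile."""
    # transform returns one cutoff per row, aligned with the original index.
    cutoff = frame.groupby(group_col)[value_col].transform(lambda s: s.quantile(q))
    if inclusive:
        mask = frame[value_col] >= cutoff
    else:
        mask = frame[value_col] > cutoff
    return frame[mask]

df = pd.DataFrame({
    "Name": ["Abby"] * 5 + ["Daniel"] * 5,
    "Marks": [100, 90, 76, 50, 10, 50, 45, 38, 25, 5],
})

top = filter_above_group_quantile(df, "Name", "Marks", 0.4)
print(top)
```

Because the cutoff is computed per group, Abby and Daniel are each compared against their own percentile rather than a single value for the whole dataset.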
Upvotes: 4