Reputation: 5117
Suppose I have a dataframe like this:
import pandas as pd
df = pd.DataFrame({'col1':['A','A', 'A', 'B','B'], 'col2':[2, 4, 6, 3, 4]})
I want to keep only the rows whose col2 value is less than the x-th quantile of col2, computed separately within each group of col1.
For example, for the 60th percentile the dataframe should look like this:
col1 col2
0 A 2
1 A 4
2 B 3
How can I do this efficiently in pandas?
Upvotes: 1
Views: 581
Reputation: 323226
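You can use groupby with transform and quantile. transform broadcasts each group's quantile back onto the rows of that group, so the result can be compared against col2 directly: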
df[df.col2.lt(df.groupby('col1').col2.transform(lambda x : x.quantile(0.6)))]
col1 col2
0 A 2
1 A 4
3 B 3
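The same idea can be wrapped in a small reusable function. This is only a sketch: the helper name filter_below_group_quantile and its parameters are made up for illustration, and it assumes that passing the string 'quantile' to transform (with q forwarded as a keyword) behaves the same as the lambda above, which avoids calling a Python function once per group.

```python
import pandas as pd


def filter_below_group_quantile(df, group_col, value_col, q):
    """Keep rows whose value is strictly below the q-th quantile of their group.

    Hypothetical helper, not part of pandas.
    """
    # Per-row threshold: each group's quantile broadcast back to its rows.
    threshold = df.groupby(group_col)[value_col].transform('quantile', q=q)
    # Strict comparison, matching Series.lt in the one-liner above.
    return df[df[value_col].lt(threshold)]


df = pd.DataFrame({'col1': ['A', 'A', 'A', 'B', 'B'], 'col2': [2, 4, 6, 3, 4]})
result = filter_below_group_quantile(df, 'col1', 'col2', 0.6)
print(result)
```

The filtered frame keeps the original row labels. If you want a fresh 0..n-1 index, call .reset_index(drop=True) on the result.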
Upvotes: 3