Reputation: 119
I have some machines and their sales in different groups e.g. machine age. My goal is to isolate the top 5% machines (for sales amount) within each group.
I managed to do this on the full dataframe using the following code
df['quantile'] = pd.qcut(df.Sales, 20, labels=range(20))
top = df[df['quantile']==0]
But I cannot seem to manage to do the same when using groupby. Essentially the assignment does not work. Any clue?
Upvotes: 2
Views: 58
Reputation: 150735
You can do groupby.transform
and quantile
:
df = pd.DataFrame({'a': np.random.randint(1,4,1000),
'Sales': np.random.uniform(0,1,1000)})
df['quantile'] = df.groupby('a').Sales.transform('quantile', 0.95)
#filter:
top = df[df.Sales.gt(df['quantile'])]
Upvotes: 2