Code to find top 95 percent of column values in dataframe

Question

I am looking for help selecting the rows that make up the top 95 percent of sales in a pandas DataFrame, grouped by a category column. I found the following (the first code block below), which is close. TotalDollars in my df is sorted correctly in descending order, but the result contains more rows than those that account for the top 95% of total dollars.

Total Dollars    Percent    Running Percent
117388           11.09%     11.09%
81632            7.71%      18.80%
46316            4.38%      23.18%
41500            3.92%      27.10%

Once the running percent reaches 95%, I want to drop the remaining rows for that category. I don't need Percent or Running Percent as columns in the df; they are shown for illustration only.

df1 = (df.groupby('channel',group_keys=False)
        .apply(lambda x: x.nlargest(int(len(x) * a), 'score')))

My code:

df_out = (df_Sales.groupby('category', group_keys=False)
          .apply(lambda x: x.nlargest(int(len(x) * 0.95), 'TotalDollars')))

Answers (1)
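
The code in the question keeps 95% of the rows in each category, because int(len(x) * 0.95) is a row count, not a share of dollars. One possible way to cut on dollars instead is to compare a per-group running total against 95% of the group total. The following is only a sketch under some assumptions: the helper name top_share_by_group and the sample data are made up for illustration, and the row that crosses the 95% mark is kept (swap the comparison if it should be dropped instead).

```python
import pandas as pd


def top_share_by_group(df, group_col, value_col, share=0.95):
    """Hypothetical helper: per group, keep the largest rows until the
    running total of value_col reaches `share` of the group total."""
    # Largest values first inside each group.
    ordered = df.sort_values([group_col, value_col], ascending=[True, False])
    grouped = ordered.groupby(group_col)[value_col]

    # Running total accumulated *before* the current row, per group.
    prior = grouped.cumsum() - ordered[value_col]
    # Group total broadcast back onto every row of the group.
    total = grouped.transform("sum")

    # Keep a row while the dollars before it are still under the threshold.
    # Assumption: this keeps the row that crosses the threshold.
    return ordered[prior < share * total]


# Made-up sample data, not from the question.
sales = pd.DataFrame({
    "category": ["A"] * 5 + ["B"] * 4,
    "TotalDollars": [12, 60, 1, 25, 2, 3, 90, 1, 6],
})

top = top_share_by_group(sales, "category", "TotalDollars")
print(top)
```

With the question's names, the call would look like top_share_by_group(df_Sales, 'category', 'TotalDollars'). No Percent or Running Percent columns are added to the result; the running total only exists as a temporary Series used for the filter.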
