W.Sun

Reputation: 668

Python Pandas - how to get top n values and the sum of all other values

I have a Pandas DataFrame like this:

Browsers        Sessions
Chrome          201
IE              136
Safari          101
Firefox         36
SamsungBrowse   12
Opera           6  

What I need is to display the top 3 values and sum the rest as 'Other':

Browsers        Sessions
Chrome          201
IE              136
Safari          101
Other           54  

Any ideas how this could be done?

Upvotes: 5

Views: 4902

Answers (2)

Mohammad Yusuf

Reputation: 17054

There may be better ways to do this, but one way is:

df2 = df.sort_values('Sessions', ascending=False)[:3]
s = df.sort_values('Sessions', ascending=False).Sessions[3:].sum()
df2.loc[len(df2)] = ['Others', s]
print(df2)

Output:

  Browsers  Sessions
0   Chrome       201
1       IE       136
2   Safari       101
3   Others        54
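
For anyone who wants to run this end to end, here is a self-contained sketch of the same sort-and-slice idea. The sample frame is rebuilt from the table in the question, and the helper name `top_n_with_other` and its parameters are made up for illustration, not part of pandas.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'Browsers': ['Chrome', 'IE', 'Safari', 'Firefox', 'SamsungBrowse', 'Opera'],
    'Sessions': [201, 136, 101, 36, 12, 6],
})

def top_n_with_other(frame, n=3, label='Other'):
    # Hypothetical helper: sort once, then split into top n rows and the rest.
    ordered = frame.sort_values('Sessions', ascending=False)
    top = ordered.iloc[:n].reset_index(drop=True)
    rest_total = ordered['Sessions'].iloc[n:].sum()
    # Append the aggregated remainder as one extra row at position n.
    top.loc[len(top)] = [label, rest_total]
    return top

result = top_n_with_other(df)
print(result)
```

Resetting the index before appending means the new row always lands at label `n`, even when the input frame is not already sorted.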

Upvotes: 5

MaxU - stand with Ukraine

Reputation: 210832

Try this:

In [39]: result = df.nlargest(3, columns='Sessions')

In [40]: result.loc[len(result)] = ['Others', df.loc[~df.Browsers.isin(result.Browsers), 'Sessions'].sum()]

In [41]: result
Out[41]:
  Browsers  Sessions
0   Chrome       201
1       IE       136
2   Safari       101
3   Others        54
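
A possible variant, sketched under the assumption that the frame's index labels are unique: select the remainder by index rather than by browser name, and build the result with `pd.concat` so the output index does not depend on the input's. The sample frame is rebuilt from the question, and the variable names are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'Browsers': ['Chrome', 'IE', 'Safari', 'Firefox', 'SamsungBrowse', 'Opera'],
    'Sessions': [201, 136, 101, 36, 12, 6],
})

n = 3
top = df.nlargest(n, 'Sessions')

# Rows not selected above, identified by index label (assumes a unique index).
rest = df.drop(top.index)

# One-row frame holding the summed remainder.
other = pd.DataFrame({'Browsers': ['Others'],
                      'Sessions': [rest['Sessions'].sum()]})

# ignore_index=True renumbers the rows 0..n regardless of the original index.
result = pd.concat([top, other], ignore_index=True)
print(result)
```

Dropping by index keeps repeated browser names in the remainder instead of filtering them out along with the top rows.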

Upvotes: 8
