Johnn-1231

Reputation: 125

How can I sort values in Python after grouping and aggregating them (with np.sum)?

So I am working with restaurant receipt data from a text file, and I have figured out how many of each item has been ordered. However, I now want to organize this in descending order, and it is not letting me; it puts the results in alphabetical order instead. After loading the data set into Python, all I did was print it and then group it so that each menu item is in its own group. After that, I aggregated it to get the total for each menu item (how many times it has been ordered). All my code is below.

import pandas as pd

import numpy as np

data = pd.read_csv('the file location', sep='\t')

df = pd.DataFrame(data)

# item_name is the variable I am interested in from the data set.
# It is the name of each menu item from the receipt.
grouped = df.groupby('item_name')

print (grouped['item_name'].agg(np.size)) #aggregating the menu items to see how many of each there are

After this, I get the total count of how many times each menu item has been ordered, but the numbers are not in descending order. They are in alphabetical order based on the name of the item. I want the counts to be listed in descending numerical order (highest number at the top). Please help!

Upvotes: 0

Views: 437

Answers (2)

Artem Kiselev

Reputation: 66

The simplest solution is to add one more column to your DataFrame, fill it with 1s, and then sum that column per group:

import pandas as pd


df = pd.read_csv('the file location', sep='\t')

df['items_count'] = 1

# Sum only the helper column so the other columns stay out of the result
grouped = df.groupby(by='item_name')[['items_count']].sum()

print(grouped.sort_values(by='items_count', ascending=False))

P.S. pd.read_csv already returns a DataFrame, so you don't need to pass the result to pd.DataFrame again.
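To try the counter-column idea without the original file, here is a minimal sketch on a small made-up DataFrame. The item names and the quantity column are invented for illustration; only the item_name column name comes from the question. Selecting items_count before summing keeps the other columns out of the result.

```python
import pandas as pd

# Hypothetical stand-in for the receipt file; only 'item_name' is from the question.
df = pd.DataFrame({
    'item_name': ['Burrito', 'Chips', 'Burrito', 'Soda', 'Burrito', 'Chips'],
    'quantity': [1, 2, 1, 1, 3, 1],  # made-up extra column
})

# Helper column: every row counts as one order line
df['items_count'] = 1

# Sum the helper column per item, then sort the totals from highest to lowest
counts = df.groupby('item_name')['items_count'].sum().sort_values(ascending=False)
print(counts)
```

The result is a Series indexed by item_name, so the sort applies to the counts rather than to the names.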

Upvotes: 0

Sy Ker

Reputation: 2190

If I understand your question correctly, you are trying to calculate how many of each item have been ordered?

import pandas as pd

df = pd.read_csv('the file location', sep='\t')

# value_counts already sorts the counts in descending order
df['item_name'].value_counts()

# other option
df.groupby('item_name').size().sort_values(ascending=False)
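As a rough check on made-up data, both options should give the same counts, highest first. The sample item names and the order_count column name are assumptions for this sketch, not part of the original data. If a regular two-column table is easier to work with, reset_index(name=...) converts the sorted Series into a DataFrame.

```python
import pandas as pd

# Hypothetical sample rows; only the 'item_name' column name is from the question.
df = pd.DataFrame({
    'item_name': ['Burrito', 'Chips', 'Burrito', 'Soda', 'Burrito', 'Chips'],
})

# Option 1: count occurrences directly (sorted by count, descending, by default)
by_value_counts = df['item_name'].value_counts()

# Option 2: group, count rows per group, then sort explicitly
by_groupby = df.groupby('item_name').size().sort_values(ascending=False)

# Optional: turn the Series into a DataFrame with a named count column
table = by_groupby.reset_index(name='order_count')
print(table)
```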

Upvotes: 1
