Reputation: 125
I am working with restaurant receipt data from a text file, and I have figured out how many times each item has been ordered. However, I now want to sort the counts in descending order, and I can't get it to work: the output comes out in alphabetical order instead. After loading the data set into Python, I printed it and then grouped it so that each menu item is in its own group. I then aggregated the groups to get the total for each menu item (how many times it has been ordered in total). All my code is below.
import pandas as pd
import numpy as np
data = pd.read_csv('the file location', sep='\t')
df = pd.DataFrame(data)
# item_name is the column I am interested in: the name of each menu item from the receipt
grouped = df.groupby('item_name')
print (grouped['item_name'].agg(np.size)) #aggregating the menu items to see how many of each there are
After this, I get the total count of how many times each menu item has been ordered, but the numbers are not in descending order. They are in alphabetical order based on the name of the item. I want the counts to be listed in descending numerical order (highest number at the top). Please help!
Upvotes: 0
Views: 437
Reputation: 66
The simplest solution is to add one more column to your DataFrame, fill it with 1s,
and then sum that column per group:
import pandas as pd
df = pd.read_csv('the file location', sep='\t')
df['items_count'] = 1
# sum only the counter column so the other columns are not aggregated as well
grouped = df.groupby(by='item_name')['items_count'].sum()
print(grouped.sort_values(ascending=False))
P.S. pd.read_csv already returns a DataFrame, so you don't need to pass it to pd.DataFrame again.
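To make the counter-column idea concrete, here is a minimal sketch on made-up data. The sample rows, the order_id column, and the item names are assumptions for illustration only; the real file will look different. It reads tab-separated text from memory, so it runs without the original file.

```python
import io

import pandas as pd

# Hypothetical tab-separated receipt data standing in for the real file.
sample = (
    "order_id\titem_name\n"
    "1\tBurrito\n"
    "1\tChips\n"
    "2\tBurrito\n"
    "3\tSalad\n"
    "3\tBurrito\n"
    "4\tChips\n"
)

# read_csv accepts a file-like object and returns a DataFrame directly.
df = pd.read_csv(io.StringIO(sample), sep="\t")

# Each row is one ordered item, so a column of 1s can be summed per group.
df["items_count"] = 1

# Sum the counter per item, then sort the totals from highest to lowest.
totals = df.groupby("item_name")["items_count"].sum().sort_values(ascending=False)
print(totals)
```

Selecting the items_count column before summing keeps unrelated columns such as order_id out of the result.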
Upvotes: 0
Reputation: 2190
If I understand your question correctly, you are trying to calculate how many of each item have been ordered?
import pandas as pd
df = pd.read_csv('the file location', sep='\t')
# value_counts already sorts the counts in descending order by default
df['item_name'].value_counts()
# other option
df.groupby('item_name').size().sort_values(ascending=False)
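As a rough illustration of why the original output came out alphabetical, the sketch below compares the two options on a small made-up column (the item names are assumptions, not taken from the real data). groupby orders its result by the group key, so the counts follow the item names until they are sorted explicitly.

```python
import pandas as pd

# Hypothetical item names standing in for the real 'item_name' column.
df = pd.DataFrame(
    {"item_name": ["Chips", "Apple Pie", "Burrito", "Chips", "Burrito", "Burrito"]}
)

# groupby sorts by the group key, so this comes back in alphabetical order.
by_name = df.groupby("item_name").size()

# Sorting the counts themselves puts the most ordered item first.
by_count = by_name.sort_values(ascending=False)

# value_counts gives the counts already sorted from highest to lowest.
vc = df["item_name"].value_counts()

print(by_name)
print(by_count)
print(vc)
```

Both by_count and vc list Burrito first; value_counts is simply the shorter way to write it.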
Upvotes: 1