Reputation: 31
I am using the below code to first group my data so that I can get total Sales Quantity for a Material in a given Territory and in a given month.
Material_Wise = data.groupby(['Material','Territory Name','Month'])['Gross Sales Qty'].sum()
print(Material_Wise)
Material Territory Name Month
A Region 1 Apr 2017 40000.0
Aug 2017 12000.0
Dec 2017 12000.0
Feb 2018 50000.0
Jan 2017 50000.0
...
E Region 2 Nov 2019 9000.0
Oct 2018 2000.0
Oct 2019 22900.0
Sept 2018 10000.0
Sept 2019 14200.0
Above is the output I am getting, Now I want to sort my data such that, I can get an output as shown below:
Material Territory Name Month
A Region 1 Jan 2017 50000.0
Apr 2017 40000.0
Aug 2017 12000.0
Dec 2017 12000.0
Feb 2018 50000.0
...
E Region 2 Sept 2018 10000.0
Oct 2018 2000.0
Sept 2019 14200.0
Oct 2019 22900.0
Nov 2019 9000.0
Upvotes: 3
Views: 52
Reputation: 93181
Since your Month
column is of string data typem, the default ordering behavior is to sort alphabetically. To sort it semantically, you need to convert it to ordered categorical type.
# Convert the months from strings to Timestamps (Apr 2017 -> 2017-01-01), drop the duplicates,
# sort them, and convert them back to strings again.
# The result is a series of semantically-ordered month names
month_names = pd.to_datetime(data['Month']).drop_duplicates().sort_values().dt.strftime('%b %Y')
# Create ordered category of month names
MonthNameDType = pd.api.types.CategoricalDtype(month_names, ordered=True)
# This will appear the same after the conversion. To check, you can use `data.dtypes` before
# and after
data['Month'] = data['Month'].astype(MonthNameDType)
# And groupby as usual
Material_Wise = data.groupby(['Material','Territory Name','Month'], observed=True)['Gross Sales Qty'].sum()
Upvotes: 1