Reputation: 55
I have the following code, but I want the bars sorted by month instead of by total. The date format is YYYY-MM-DD.
df.groupby(df['date'].dt.strftime('%B'))['total'].sum().sort_values().plot.bar(figsize=(20,10))
Upvotes: 0
Views: 2235
Reputation: 150735
The month names are strings, and Python doesn't know how to relate them to calendar order. You can try reindex:
# list the month names in calendar order by hand
months = ['January', 'February', 'March', 'April',
'May', 'June', 'July', 'August',
'September', 'October', 'November', 'December']
(df.groupby(df['date'].dt.strftime('%B'))
['total'].sum()
.reindex(months)
.plot.bar(figsize=(20,10))
)
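To see the reindex step without plotting, here is a minimal sketch. The sample df is made up for illustration, and building the month list from calendar.month_name instead of typing it is my own substitution; it assumes an English locale, since both calendar.month_name and strftime('%B') follow the locale.

```python
import calendar

import pandas as pd

# Hypothetical sample data standing in for the asker's df.
df = pd.DataFrame({
    "date": pd.to_datetime(["2020-03-15", "2020-01-10", "2020-03-01", "2020-02-20"]),
    "total": [30, 10, 5, 20],
})

# calendar.month_name[0] is an empty string, so skip it.
months = list(calendar.month_name)[1:]

monthly = (
    df.groupby(df["date"].dt.strftime("%B"))["total"]
      .sum()
      .reindex(months)  # impose calendar order; absent months become NaN
      .dropna()         # optional: drop months that have no data
)
```

Calling monthly.plot.bar(figsize=(20,10)) on the result should then draw the bars in calendar order. Leave out dropna() if you want empty slots for months without data.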
Or, less error-prone, group by the numeric month along with the name, then discard the numbers:
(df.groupby([df['date'].dt.month,df['date'].dt.strftime('%B')])
['total'].sum()
.reset_index(level=0, drop=True)
.plot.bar(figsize=(20,10))
)
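A small sketch of the same idea on made-up data, in case it helps to see the intermediate result. The names month_num and month_name are hypothetical labels I added so the two index levels are easy to tell apart, and droplevel is used here as an equivalent way to discard the numeric level.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's df.
df = pd.DataFrame({
    "date": pd.to_datetime(["2020-03-15", "2020-01-10", "2020-03-01", "2020-02-20"]),
    "total": [30, 10, 5, 20],
})

# Two grouping keys: the number drives the sort order, the name is the label.
month_num = df["date"].dt.month.rename("month_num")
month_name = df["date"].dt.strftime("%B").rename("month_name")

monthly = (
    df.groupby([month_num, month_name])["total"]
      .sum()                  # groupby sorts by month_num first
      .droplevel("month_num") # keep only the month names as the index
)
```

The result is a Series indexed by month name but ordered by month number, so monthly.plot.bar(figsize=(20,10)) should keep calendar order without a hand-typed list.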
Upvotes: 1
Reputation: 6025
Just add a new Month column and sort by it:
df['Month'] = df['date'].dt.month
df.groupby('Month')['total'].sum().sort_index().plot.bar(figsize=(20,10))
If you want the order to be alphabetical, build the Month column from the month names instead:
df['Month'] = df['date'].dt.month_name()
df.groupby('Month')['total'].sum().sort_index().plot.bar(figsize=(20,10))
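To make the difference between the two orderings concrete, here is a rough sketch on made-up data. The sample df and the variable names by_number and by_name are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's df.
df = pd.DataFrame({
    "date": pd.to_datetime(["2020-03-15", "2020-01-10", "2020-03-01", "2020-02-20"]),
    "total": [30, 10, 5, 20],
})

# Numeric month keys sort in calendar order.
by_number = df.groupby(df["date"].dt.month)["total"].sum().sort_index()

# Month-name keys sort alphabetically, which is not calendar order.
by_name = df.groupby(df["date"].dt.month_name())["total"].sum().sort_index()
```

With numeric keys the bars come out as 1, 2, 3; with names they come out as February, January, March.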
Upvotes: 0
Reputation: 64
You need to tell the sort_values() function how you'd like it to sort. Try replacing that part of the expression with sort_values(by='date')
Upvotes: 0