Johnny

Reputation: 361

Plot a pandas dataframe using matplotlib with data grouped by year/month

I'm using Jupyter, pandas and matplotlib to create a bar plot from the following data.

How do I group the data by month and year on the x axis, so that it is clear which year each month belongs to?

year    month count
2005    9   40789
2005    10  17998
...
2014    12  2168
2015    1   2286
2015    2   1274
2015    3   1126
2015    4   344

df.plot(kind='bar', x='month', y='count', color='blue', title="Num per year")
plt.show()


Upvotes: 1

Views: 1311

Answers (2)

jayveesea

Reputation: 3199

You could color each year a different color.

Create some data:

import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import cm
import numpy as np

# here's some data
N=50
df = pd.DataFrame({'year': np.random.randint(2005,2015,N),
                   'month': np.random.randint(1,13,N),  # upper bound is exclusive, so 13 gives months 1-12
                   'count': np.random.randint(1,1500,N)})
df.sort_values(by=['year', 'month'],inplace=True)

And then create a color array with a color for each year:

# color map based on years
yrs = np.unique(df.year)
c = plt.get_cmap('tab20', len(yrs))  # cm.get_cmap was removed in Matplotlib 3.9
## probably a more elegant way to do this...
yrClr = np.zeros((len(df.year),4))
for i, v in enumerate(yrs): 
    yrClr[df.year==v,:]=c.colors[i,:]

# then use yrClr for color               
df.plot(kind='bar', x='month', y='count', color=yrClr, title="Num per year")
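On the "more elegant way" comment: one possible shortcut, sketched here under my own assumptions, is to let pd.factorize number the years and then index a palette with those numbers. The frame `demo` and the names `codes`, `palette` and `bar_colors` are made up for illustration, and the colors are taken straight from the first entries of tab20 rather than from a resampled map, so they will not match the loop above exactly.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical sample with the same columns as the question's data
demo = pd.DataFrame({
    "year":  [2005, 2005, 2006, 2006, 2006, 2007],
    "month": [9, 10, 1, 2, 3, 1],
    "count": [40789, 17998, 1200, 900, 700, 500],
})

# factorize labels each distinct year 0, 1, 2, ... in order of appearance
codes, years = pd.factorize(demo["year"])

# Take one RGBA color per distinct year from tab20 (integer inputs index the
# colormap directly), then use the codes to expand to one row per bar
palette = plt.get_cmap("tab20")(np.arange(len(years)))
bar_colors = palette[codes]

ax = demo.plot(kind="bar", x="month", y="count", color=bar_colors,
               legend=False, title="Num per year")
```

Note that tab20 only has 20 distinct entries, so this sketch assumes no more than 20 different years.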

UPDATE: it might also help to combine month and year into a single x-axis label, like this:

fig, axs = plt.subplots(figsize=(12, 4))
df['MonthYr']=pd.to_datetime(df.assign(day=1)[['year','month','day']]).dt.strftime('%m-%Y')
df.plot(kind='bar', x='MonthYr', y='count', color=yrClr, title="Num per year",ax=axs)
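To make the label step easier to follow, here is a small standalone sketch of what that to_datetime / strftime line produces. The three rows are copied from the question's sample; the name `demo` is just for illustration.

```python
import pandas as pd

# Three rows taken from the question's sample data
demo = pd.DataFrame({
    "year":  [2005, 2005, 2015],
    "month": [9, 10, 4],
    "count": [40789, 17998, 344],
})

# to_datetime can assemble dates from year/month/day columns,
# so add a constant day=1 and format the result as MM-YYYY
dates = pd.to_datetime(demo.assign(day=1)[["year", "month", "day"]])
demo["MonthYr"] = dates.dt.strftime("%m-%Y")

print(demo["MonthYr"].tolist())  # ['09-2005', '10-2005', '04-2015']
```

If you would rather have labels that also sort correctly as plain strings, '%Y-%m' is an alternative format.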


Upvotes: 3

Quang Hoang

Reputation: 150735

You can use sns.barplot with hue and dodge:

import seaborn as sns

sns.barplot(data=df, x='year', hue='month', y='count', dodge=True)

Or you can pivot the table and use plot.bar():

(df.pivot_table(index='year', columns='month', 
               values='count', aggfunc='sum')
   .plot.bar()
)

which would give you something like this:

enter image description here
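To show what the pivot does before plotting, here is a rough sketch on a tiny made-up frame (`demo` and `wide` are illustrative names, and the numbers are only loosely based on the question). Each year becomes a row and each month a column, so plot.bar() draws one cluster per year with one bar per month.

```python
import pandas as pd

# Hypothetical sample; (2015, 2) appears twice to show aggfunc='sum' at work
demo = pd.DataFrame({
    "year":  [2014, 2014, 2015, 2015, 2015],
    "month": [11, 12, 1, 2, 2],
    "count": [100, 2168, 2286, 1274, 26],
})

# Rows = years, columns = months; duplicate year/month pairs are summed
wide = demo.pivot_table(index="year", columns="month",
                        values="count", aggfunc="sum")
print(wide)

# One group of bars per year, one bar (and legend entry) per month
ax = wide.plot.bar(title="Count per month, grouped by year")
```

Year/month combinations that are missing from the data come out as NaN and simply leave a gap in the cluster; passing fill_value=0 to pivot_table is one way to make that explicit.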

Upvotes: 2
