BhishanPoudel

Reputation: 17144

How to plot timeseries using pandas with monthly groupby?

I was trying to plot a time series after grouping by month, but I still get years on the x-axis labels instead of months. How can I get month names on the x-axis and a separate curve for each year?

Here is my attempt:

import numpy as np
import pandas as pd
import statsmodels.api as sm

df = pd.DataFrame.from_records(sm.datasets.co2.load().data)
df['index'] = pd.to_datetime(df['index'])
df = df.set_index('index')

ts = df['co2']['1960':]
ts = ts.bfill()
ts = ts.resample('MS').sum()

ts.groupby(ts.index.month).plot()

Wanted:

Month names on the x-axis of the plot and a separate curve for each year.

Upvotes: 2

Views: 11001

Answers (2)

BhishanPoudel

Reputation: 17144

You can start with this:

ts.groupby([ts.index.month,ts.index.year]).sum().unstack().plot(figsize=(12,8))
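To see why the one-liner above works, here is a minimal sketch on synthetic data (the series `demo` and its values are made up for illustration, not taken from the co2 dataset). Grouping by month and year and then unstacking yields a table with one row per month and one column per year, so `.plot()` draws one line per year:

```python
import numpy as np
import pandas as pd

# Synthetic monthly series covering 3 full years (made-up values).
idx = pd.date_range("2000-01-01", periods=36, freq="MS")
demo = pd.Series(np.arange(36, dtype=float), index=idx)

# Group by (month, year), then move the year level into the columns.
table = demo.groupby([demo.index.month, demo.index.year]).sum().unstack()

print(table.shape)             # (12, 3): 12 months as rows, 3 years as columns
print(table.columns.tolist())  # [2000, 2001, 2002]
```

Calling `table.plot()` on this result should give three curves over the month numbers 1 to 12.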

Update

import numpy as np
import pandas as pd
import calendar
import seaborn as sns
sns.set(color_codes=True)

import matplotlib
import matplotlib.pyplot as plt
# Jupyter/IPython magic; drop this line when running as a plain script
%matplotlib inline


df = pd.read_csv('https://github.com/selva86/datasets/raw/master/AirPassengers.csv',
                 parse_dates=['date'],index_col=['date'])


ts = df['value']

df_plot = ts.groupby([ts.index.month,ts.index.year]).sum().unstack()
df_plot

fig, ax = plt.subplots(figsize=(12,8))
df_plot.plot(ax=ax,legend=False)

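# xticks: df_plot's index holds the month numbers 1..12, so place the ticks there
months = [calendar.month_abbr[i] for i in range(1,13)]
ax.set_xticks(range(1,13))
ax.set_xticklabels(months)

# label each curve with its year, just to the right of its December point
for col in df_plot.columns:
    ax.annotate(str(col),xy=(12, df_plot[col].iloc[-1]),
                xytext=(5,0),textcoords='offset points',va='center')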
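One thing worth checking when labelling ticks by hand: pandas draws a line plot with an integer index at the index values themselves, so a month index of 1..12 puts January at x = 1, not x = 0. The sketch below verifies this on a made-up table (the name `table` and its random values are assumptions for illustration) and uses a headless backend so it runs without a display:

```python
import calendar

import matplotlib
matplotlib.use("Agg")  # headless backend, no window needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up table: rows are months 1..12, columns are two years.
rng = np.random.default_rng(0)
table = pd.DataFrame(rng.random((12, 2)), index=range(1, 13), columns=[2000, 2001])

fig, ax = plt.subplots()
table.plot(ax=ax, legend=False)

# x positions of the first plotted line: these are the index values.
xdata = [int(x) for x in ax.get_lines()[0].get_xdata()]

# Put the ticks at the same positions and label them with month names.
ax.set_xticks(list(table.index))
ax.set_xticklabels([calendar.month_abbr[m] for m in table.index])
labels = [t.get_text() for t in ax.get_xticklabels()]

print(xdata[0], labels[0])    # first data point and its label
print(xdata[-1], labels[-1])  # last data point and its label
```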


Upvotes: 1

rhedak

Reputation: 409

I think you're looking for pandas.to_datetime() and then the .month or .year property of the datetime index.

Also, by using statsmodels' 'as_pandas=True', your code becomes a bit shorter.

Anyway, if you want to plot the month on the x-axis with the year as hue, I recommend using seaborn over matplotlib.

import pandas as pd
import statsmodels.api as sm
import seaborn as sns

df = sm.datasets.co2.load(as_pandas=True).data
df['month'] = pd.to_datetime(df.index).month
df['year'] = pd.to_datetime(df.index).year
sns.lineplot(x='month',y='co2',hue='year',data=df.query('year>1995')) # filtered over 1995 to make the plot less cluttered

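If seaborn is not available, the same month and year columns can be pivoted into a months-by-years table and plotted with pandas directly. This is only a sketch under assumptions: the weekly data below is synthetic and merely stands in for the co2 dataset, and `monthly` is an illustrative name. It takes the mean per month and year, which matches seaborn's default `estimator='mean'` for repeated x values:

```python
import numpy as np
import pandas as pd

# Synthetic weekly data standing in for the co2 dataset (made-up values).
idx = pd.date_range("1996-01-06", "1998-12-26", freq="W-SAT")
df = pd.DataFrame({"co2": np.linspace(360.0, 368.0, len(idx))}, index=idx)

df["month"] = df.index.month
df["year"] = df.index.year

# Mean co2 per (month, year): rows are months, columns are years.
monthly = df.pivot_table(index="month", columns="year", values="co2", aggfunc="mean")

print(monthly.shape)  # (12, 3)
```

`monthly.plot()` then draws one curve per year over the month numbers.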

Upvotes: 3
