A 'pythonic' way to generate a seasonal dataframe from a pandas timeseries dataframe

Question

I have a pandas dataframe that captures values over a timespan (maybe monthly over years, or daily over years, or daily over months). There is no guarantee that the time series is continuous (some months might be missing in a year).

import pandas as pd

# No guarantee that this index will have an entry for every month of the time range!
# 'M' is the month-end alias; pandas 2.2 and later prefer 'ME'.
dates = pd.date_range('1/1/2015', periods=36, freq='M')
df = pd.DataFrame(index=dates)
df['value'] = df.index.year * 0.1 + df.index.month * 0.05
df.plot()

This gives me a simple time series plot.

But what I want to make is a 'seasonal' plot. This would display each year's data as a different line on the same index of months. As a simple display:

import numpy as np
index = ['jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug', 'sep', 'oct', 'nov', 'dec']
df = pd.DataFrame(index = index)
df[2015] = np.arange(12)*0.4+1
df[2016] = np.arange(12)*0.35+1.4
df[2017] = np.arange(12)*0.5+1.2

df.plot()

I'm looking for a 'pythonic' or elegant way to do this operation. My attempts to transform have been incredibly gross, spaghetti, garbage code. I am sure there must be some tidy approach using pandas/python to do this transformation efficiently and cleanly. In particular, I want to find an abstracted way to do this, so that I can generalize it to making charts showing "seasonality" of days across a month, etc.

To start with, I'm not even sure what is a good index to build and base this chart off of.

jezrael · Accepted Answer

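You can use DatetimeIndex.strftime and DatetimeIndex.year to get the month names and years, wrap the month names in an ordered CategoricalIndex so they sort in calendar order, and finally reshape. Current pandas releases require a DataFrame as the first argument of pd.pivot and no longer accept bare arrays, so the reshape here uses set_index and unstack:

c = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']

months = pd.CategoricalIndex(df.index.strftime('%b'), ordered=True, categories=c)
# Index the rows by (month, year), then move the year level into the columns
df = df.set_index([months, df.index.year])['value'].unstack()
print(df)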

       2015    2016    2017
Jan  201.55  201.65  201.75
Feb  201.60  201.70  201.80
Mar  201.65  201.75  201.85
Apr  201.70  201.80  201.90
May  201.75  201.85  201.95
Jun  201.80  201.90  202.00
Jul  201.85  201.95  202.05
Aug  201.90  202.00  202.10
Sep  201.95  202.05  202.15
Oct  202.00  202.10  202.20
Nov  202.05  202.15  202.25
Dec  202.10  202.20  202.30

df.plot()
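
The question stresses that the series may have gaps. As a minimal sketch of how a gap behaves in this kind of reshape, the example below builds two years of made-up month-start data, drops August 2016, and reshapes with set_index and unstack. The names ts and seasonal are only illustrative.

```python
import numpy as np
import pandas as pd

c = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']

# Two years of month-start data (made-up values 0.0 .. 23.0)
dates = pd.date_range('2015-01-01', periods=24, freq='MS')
ts = pd.DataFrame({'value': np.arange(24.0)}, index=dates)

# Drop one month to mimic a gap in the series
ts = ts.drop(pd.Timestamp('2016-08-01'))

# Ordered categorical month labels keep the rows in calendar order
months = pd.CategoricalIndex(ts.index.strftime('%b'), categories=c, ordered=True)

# Rows keyed by (month, year); unstack moves the year level into the columns
seasonal = ts.set_index([months, ts.index.year])['value'].unstack()

print(seasonal.loc['Aug'])
```

The missing month shows up as NaN in the 2016 column, which a line plot should draw as a break rather than a false value. A month absent from every year would likely not appear as a row at all; reindexing on the full list c is one way to bring it back.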

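Another solution is to create new columns on the original time-series df and pivot on them:

# df here is the original datetime-indexed frame, not the reshaped result above
df['months'] = pd.CategoricalIndex(df.index.strftime('%b'), ordered=True, categories=c)
df['years'] = df.index.year
df = df.pivot(index='months', columns='years', values='value')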
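
The question also asks for a way to generalize this, for example to days across a month. One possible sketch, not part of the answer above, is to pass in the two keys as functions of the DatetimeIndex. The helper seasonal_frame and its parameters season_key and period_key are hypothetical names introduced here for illustration.

```python
import pandas as pd

def seasonal_frame(series, season_key, period_key):
    """Reshape a datetime-indexed Series into a season-by-period frame.

    season_key and period_key are hypothetical callables that take the
    DatetimeIndex and return one label per row (rows and columns of the result).
    """
    idx = series.index
    keyed = pd.DataFrame({
        'season': season_key(idx),
        'period': period_key(idx),
        'value': series.to_numpy(),
    })
    # DataFrame.pivot raises ValueError if a (season, period) pair repeats
    return keyed.pivot(index='season', columns='period', values='value')

# Made-up daily data over three months
days = pd.date_range('2017-01-01', '2017-03-31', freq='D')
daily = pd.Series(range(len(days)), index=days, dtype=float)

# One column per month, one row per day of the month
by_day = seasonal_frame(daily,
                        season_key=lambda i: i.day,
                        period_key=lambda i: i.strftime('%Y-%m'))
print(by_day.tail())
```

Days that do not exist in a given month (such as 30 February) come out as NaN. Passing season_key=lambda i: i.month and period_key=lambda i: i.year should give the month-by-year layout, with month numbers rather than names as the row labels.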
