user3556757

Reputation: 3619

A 'pythonic' way to generate a seasonal dataframe from a pandas timeseries dataframe

I have a pandas dataframe that captures values over a timespan (maybe monthly over years, or daily over years, or daily over months). There is no guarantee that the time series is continuous (some months might be missing in a year).

import pandas as pd

# No guarantee that this index will have an entry for every month of the time range!
dates = pd.date_range('1/1/2015', periods=36, freq='M')
df = pd.DataFrame(index=dates)
df['value'] = df.index.year * 0.1 + df.index.month * 0.05
df.plot()

This gives me a simple time series plot.

But what I want to make is a 'seasonal' plot. This would display each year's data as a different line on the same index of months. As a simple display:

import numpy as np
index = ['jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug', 'sep', 'oct', 'nov', 'dec']
df = pd.DataFrame(index = index)
df[2015] = np.arange(12)*0.4+1
df[2016] = np.arange(12)*0.35+1.4
df[2017] = np.arange(12)*0.5+1.2

df.plot()


I'm looking for a 'pythonic' or elegant way to do this operation. My attempts at the transformation have been gross spaghetti code. I am sure there must be a tidy approach using pandas/python that does this transformation efficiently and cleanly. In particular, I want to find an abstracted way to do this, so that I can generalize it to making charts showing "seasonality" of days across a month, etc.

To start with, I'm not even sure what is a good index to build and base this chart off of.

Upvotes: 1

Views: 2463

Answers (1)

jezrael

Reputation: 863166

You can use DatetimeIndex.strftime for the month names and DatetimeIndex.year for the years. For correct ordering, wrap the month names in an ordered CategoricalIndex, then reshape with pivot:

c = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']

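# Current pandas requires a DataFrame as the first argument of pd.pivot,
# so attach the month and year keys as columns before reshaping.
keys = df.assign(month=pd.CategoricalIndex(df.index.strftime('%b'), ordered=True, categories=c),
                 year=df.index.year)
df = pd.pivot(keys, index='month', columns='year', values='value').rename_axis(index=None, columns=None)
print(df)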

       2015    2016    2017
Jan  201.55  201.65  201.75
Feb  201.60  201.70  201.80
Mar  201.65  201.75  201.85
Apr  201.70  201.80  201.90
May  201.75  201.85  201.95
Jun  201.80  201.90  202.00
Jul  201.85  201.95  202.05
Aug  201.90  202.00  202.10
Sep  201.95  202.05  202.15
Oct  202.00  202.10  202.20
Nov  202.05  202.15  202.25
Dec  202.10  202.20  202.30

df.plot()
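Since the question stresses that some months may be missing, here is a minimal sketch of how the same reshape behaves on a gappy series. The sample data and the names `gappy`, `keys` and `seasonal` are made up for illustration, and the month names assume an English locale for `strftime('%b')`. A missing month should simply show up as NaN in that year's column, which the line plot then skips.

```python
import pandas as pd

c = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']

# Hypothetical sample: 36 month-start dates, with Feb 2015 and Mar 2016 dropped to mimic gaps.
dates = pd.date_range('2015-01-01', periods=36, freq='MS')
gappy = pd.DataFrame({'value': range(36)}, index=dates).drop(dates[[1, 14]])

# Attach the month (ordered categorical) and year keys, then pivot as in the answer.
keys = gappy.assign(
    month=pd.CategoricalIndex(gappy.index.strftime('%b'), categories=c, ordered=True),
    year=gappy.index.year,
)
seasonal = keys.pivot(index='month', columns='year', values='value')

# The dropped months are left as NaN; every other cell keeps its value.
print(seasonal)
```

Note that pivot raises an error if the same month/year pair occurs more than once, so data with several observations per month would need aggregating first.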

Another solution is to create new columns on the original df (before the reshape above) and pivot on them:

df['months'] = pd.CategoricalIndex(df.index.strftime('%b'), ordered=True, categories=c)
df['years'] = df.index.year
df = df.pivot(index='months', columns='years',values='value')
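The question also asks for an abstracted version that extends to, say, days across a month. One possible sketch follows, not the only way to do it. The helper `seasonal_frame` and its `position` and `cycle` parameters are hypothetical names, not part of pandas. It swaps pivot for groupby plus unstack, so duplicate observations within a cell are aggregated (mean by default) instead of raising an error. It also uses numeric positions (month number, day of month) so the rows sort naturally without a categorical.

```python
import pandas as pd

def seasonal_frame(series, position, cycle, aggfunc='mean'):
    """Reshape a datetime-indexed Series: rows are positions within a cycle, columns are cycles.

    `position` and `cycle` are callables that take the DatetimeIndex and return
    one key per observation (hypothetical helper, not a pandas API).
    """
    idx = series.index
    grouped = series.groupby([position(idx), cycle(idx)]).agg(aggfunc)
    # Move the cycle key (the last group level) into the columns.
    return grouped.unstack()

# Months within years (illustrative data).
months = pd.date_range('2015-01-01', periods=36, freq='MS')
monthly = pd.Series(range(36), index=months, dtype=float)
by_month = seasonal_frame(monthly, position=lambda i: i.month, cycle=lambda i: i.year)

# Days within months (illustrative data); shorter months are padded with NaN.
days = pd.date_range('2017-01-01', '2017-03-31', freq='D')
daily = pd.Series(range(len(days)), index=days, dtype=float)
by_day = seasonal_frame(daily, position=lambda i: i.day, cycle=lambda i: i.strftime('%Y-%m'))

print(by_month)
print(by_day)
```

Calling `by_month.plot()` or `by_day.plot()` then draws one line per cycle, as in the seasonal chart from the question.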

Upvotes: 4
