Evan Brittain

Reputation: 587

How to create a pivot table using a datetime column in pandas

I have a datetime column and a value column that I would like to pivot. The goal is to get one column per month and a single row holding the mean value for each month.

import pandas as pd
import numpy as np
import calendar

d = dict(enumerate(calendar.month_abbr))

rng = pd.date_range('2019-01-01', periods=365, freq='D')
df = pd.DataFrame({'Date': rng, 'Val': np.random.randint(10, size=365)})
df.set_index('Date', inplace=True)

df = df.resample('1M').mean().reset_index()
df['Month'] = df['Date'].apply(lambda x: d[x.month])

df.pivot(columns='Month', values='Val')

The output should have 12 columns (Jan, Feb, Mar, etc.) and 1 row containing the mean for each month.

Upvotes: 0

Views: 505

Answers (2)

Andy L.

Reputation: 25239

Set every label in df.index to 0 so that pivot places all months on a single row, then call your pivot command and reindex the columns to restore calendar order:

df.index = [0]*df.index.size    
df_out = df.pivot(columns='Month', values='Val').reindex(columns=df.Month)

Or do it directly as a one-liner:

df_out = (df.set_index(np.array([0]*df.index.size))
            .pivot(columns='Month', values='Val').reindex(columns=df.Month))


Out[88]:
Month       Jan   Feb       Mar       Apr       May  Jun       Jul       Aug  \
0      4.290323  3.75  4.032258  4.033333  4.225806  4.4  4.774194  4.774194

Month  Sep      Oct       Nov       Dec
0      4.6  4.16129  4.233333  3.935484
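To see why the constant index matters, here is a minimal sketch. It assumes a hand-built frame named monthly (a hypothetical stand-in for the resampled df, with made-up values 1.0 to 12.0) so the shapes are easy to check:

```python
import calendar
import pandas as pd

# Hypothetical stand-in for the resampled frame: one row per month,
# default RangeIndex 0..11, made-up values 1.0..12.0.
months = list(calendar.month_abbr)[1:]  # ['Jan', ..., 'Dec']
monthly = pd.DataFrame({'Month': months,
                        'Val': [float(i) for i in range(1, 13)]})

# Without an index argument, pivot keeps the existing row labels, so each
# month's value sits on its own row and the rest of the row is NaN.
sparse = monthly.pivot(columns='Month', values='Val')
print(sparse.shape)  # (12, 12)

# With every row labelled 0, all twelve values share one row.
flat = monthly.copy()
flat.index = [0] * len(flat)
one_row = flat.pivot(columns='Month', values='Val').reindex(columns=months)
print(one_row.shape)  # (1, 12)
```

The reindex call is only there because pivot sorts the column labels alphabetically.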

If you don't want to change df.index as above, you can instead use pivot followed by ffill, bfill, and iloc:

df_out = (df.pivot(columns='Month', values='Val').ffill().bfill().iloc[[0]]
            .reindex(columns=df.Month))


Out[99]:
Month       Jan   Feb       Mar       Apr       May  Jun       Jul       Aug  \
0      4.290323  3.75  4.032258  4.033333  4.225806  4.4  4.774194  4.774194

Month  Sep      Oct       Nov       Dec
0      4.6  4.16129  4.233333  3.935484
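A rough sketch of what the ffill/bfill chain does, again using a hypothetical hand-built monthly frame with made-up values rather than the random data above:

```python
import calendar
import pandas as pd

# Hypothetical monthly frame: one row per month, values 1.0..12.0.
months = list(calendar.month_abbr)[1:]
monthly = pd.DataFrame({'Month': months,
                        'Val': [float(i) for i in range(1, 13)]})

# 12 x 12 frame where each column holds exactly one non-NaN value.
wide = monthly.pivot(columns='Month', values='Val')

# ffill copies each value down its column and bfill copies it up,
# so afterwards every row is identical and free of NaN.
filled = wide.ffill().bfill()

# Keep the first row as a one-row DataFrame and restore calendar order.
first = filled.iloc[[0]].reindex(columns=months)
print(first.shape)  # (1, 12)
```

This relies on each month label appearing only once. If the data spanned several years, row 0 would pick up a single value per month rather than an aggregate.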

Upvotes: 0

Matt

Reputation: 96

Use pd.pivot_table instead:

import pandas as pd
import numpy as np
import calendar

d = dict(enumerate(calendar.month_abbr))

rng = pd.date_range('2019-01-01', periods=365, freq='D')
df = pd.DataFrame({'Date': rng, 'Val': np.random.randint(10, size=365)})
df.set_index('Date', inplace=True)

df = df.resample('1M').mean().reset_index()
df['Month'] = df['Date'].apply(lambda x: d[x.month])

pd.pivot_table(data=df, columns='Month', values='Val', aggfunc='mean')

Output (the columns come out in alphabetical order, not calendar order):

Month  Apr       Aug       Dec       Feb       Jan       Jul       Jun  \
Val    3.2  4.419355  4.548387  5.857143  5.322581  4.354839  5.033333   

Month       Mar       May       Nov       Oct  Sep  
Val    4.645161  4.193548  4.966667  3.645161  3.7  
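If calendar order is wanted, one option is to reindex the columns of the pivot_table result. This is a sketch under the assumption of a hand-built monthly frame with made-up values (not the random data above); reindex is borrowed from the other answer rather than being part of pivot_table itself:

```python
import calendar
import pandas as pd

# Hypothetical monthly means, kept simple so the ordering is easy to check.
months = list(calendar.month_abbr)[1:]
monthly = pd.DataFrame({'Month': months,
                        'Val': [float(i) for i in range(1, 13)]})

# pivot_table sorts the column labels, so months come out alphabetically.
table = pd.pivot_table(data=monthly, columns='Month', values='Val',
                       aggfunc='mean')
print(list(table.columns[:3]))  # ['Apr', 'Aug', 'Dec']

# Reorder the columns Jan..Dec; the single row is still labelled 'Val'.
ordered = table.reindex(columns=months)
print(list(ordered.columns[:3]))  # ['Jan', 'Feb', 'Mar']
```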

Upvotes: 1
