Eamon Murphy
Eamon Murphy

Reputation: 65

Pandas Pivot with datetime index

I am having some troubles pivoting a dataframe with a datetime value as the index. my df looks like this:

Timestamp           Value
2016-01-01 00:00:00 16.546900
2016-01-01 01:00:00 16.402375
2016-01-01 02:00:00 16.324250

Where the timestamp is a, datetime64[ns]. I am trying to pivot the table so that it looks like this.


Hour       0    1    2    4   ....
Date
2016-01-01 16.5 16.4 16.3 17  ....
....
....

I've tried using the code below but am getting an error when I run it.

df3 =  pd.pivot_table(df2,index=np.unique(df2.index.date),columns=np.unique(df2.index.hour),values=df2.Temp)

KeyError                                  Traceback (most recent call last)
 in ()
      1 # Pivot Table
----> 2 df3 = pd.pivot_table(df2,index=np.unique(df2.index.date),columns=np.unique(df2.index.hour),values=df2.Temp)

~\Anaconda3\lib\site-packages\pandas\core\reshape\pivot.py in pivot_table(data, values, index, columns, aggfunc, fill_value, margins, dropna, margins_name) 56 for i in values: 57 if i not in data: ---> 58 raise KeyError(i) 59 60 to_filter = []

KeyError: 16.5469

Any help or insights would be greatly appreciated.

Upvotes: 5

Views: 11692

Answers (3)

Iotatron
Iotatron

Reputation: 73

Adapting @Seanny123 's answer above for an arbitrary cadence:

start = [2018, 1, 1, 0, 0, 0]
end = [date.today().year, date.today().month, date.today().day]
quant='freq'
sTime_tmp = datetime.datetime(start[0], start[1], start[2], tzinfo = pytz.UTC)
eTime_tmp = datetime.datetime(end[0], end[1], end[2], tzinfo = pytz.UTC)
cadence = '5min'
t = pd.date_range(start=sTime_tmp,
                  end=eTime_tmp,
                  freq = cadence)
keo = pd.DataFrame(np.nan, index=t, columns=[quant])
keo[quant] = 0
keo = pd.pivot_table(keo, index=keo.index.time, columns=keo.index.date, values=quant)

keo

Upvotes: 0

Seanny123
Seanny123

Reputation: 9346

A different way of accomplishing this without the lambda is to create the indices from the DateTimeIndex.

df2 = pd.pivot_table(df, index=df.index.date, columns=df.index.hour, values="Value")

Upvotes: 4

gyoza
gyoza

Reputation: 2152

I slightly extended input data like below (assuming no duplicated entries in the same date/hour)

Timestamp           Value
2016-01-01 00:00:00 16.546900
2016-01-01 01:00:00 16.402375
2016-01-01 02:00:00 16.324250
2016-01-01 04:00:00 16.023928
2016-01-03 04:00:00 16.101919
2016-01-05 23:00:00 13.405928

It looks a bit awkward, but something like below works.

df2['Date'] = df2.Timestamp.apply(lambda x: str(x).split(" ")[0])
df2['Hour'] = df2.Timestamp.apply(lambda x: str(x).split(" ")[1].split(":")[0])
df3 = pd.pivot_table(df2, values='Value', index='Date', columns='Hour')

[Output]

Hour        00      01        02        04          23
Date                    
2016-01-01  16.5469 16.402375 16.32425  16.023928   NaN
2016-01-03  NaN     NaN       NaN       16.101919   NaN
2016-01-05  NaN     NaN       NaN       NaN         13.405928

Finally if your columns need to be integer,

df3.columns = [int(x) for x in df3.columns]

Hope this helps.

Upvotes: 2

Related Questions