user13420356

How to pivot a pandas df where each column header is an hour and each row is a date

I have a pandas df with a DATE column followed by one column per hour of the day ('00:00', '01:00', and so on), so each row holds the hourly values for a single date.

I would like to pivot this df so that each hour of the day is its own row, giving 24 rows for each date.

I've tried to use pd.melt with the following:

hourly_value = ['00:00','01:00','02:00','03:00','04:00','05:00','06:00','07:00','08:00','09:00','10:00','11:00','12:00']
df = df.melt(id_vars = ['DATE'], var_name = hourly_value, value_name = ('Hourly Precip'))

but I keep getting the error "IndexError: Too many levels: Index has only 1 level, not 2". I've also looked into using df.pivot, but I'm starting to think my df is in a very different format from most of the examples.

Upvotes: 0

Views: 736

Answers (2)

Alex K

Reputation: 41

One way to get what you want is to:

  1. Use .set_index('DATE') to turn the DATE column into the index.

  2. Use .stack() to move the column labels (the hours) into the index as well, creating a MultiIndex with DATE as the first level and the hour as the second level.

  3. Use .reset_index() to turn all index levels back into regular columns.

The following snippet illustrates:

import numpy as np
import pandas as pd

dates = [f"1/{i}/2020" for i in range(1, 21)]
cols = ["DATE"] + [str(i) + ":00" for i in range(24)]  # 24 hour columns, 0:00 to 23:00
zeros = np.zeros((len(dates), len(cols) - 1))
data = list([[x] + list(y) for x, y in zip(dates, zeros)])

df = pd.DataFrame(data=data, columns=cols)

df2 = (
    df.set_index("DATE") # makes the DATE column the index
    .stack()             # moves the hour columns into a second index level
    .reset_index()       # turns both index levels back into columns
    .rename(columns={"level_1": "Time", 0: "Value"})
)
print(df2.head())

Which outputs:

       DATE  Time  Value
0  1/1/2020  0:00    0.0
1  1/1/2020  1:00    0.0
2  1/1/2020  2:00    0.0
3  1/1/2020  3:00    0.0
4  1/1/2020  4:00    0.0

Upvotes: 1

Mehdi Golzadeh

Reputation: 2583

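Try this (the reset_index() call is only needed if DATE is currently the index; if DATE is already a regular column, drop it, otherwise the old index gets melted as well):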

 pd.melt( df.reset_index(), id_vars=['DATE'], var_name='hour', value_name='Hourly Precip')
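
As a rough sketch of the same idea, the hour columns can be passed explicitly as value_vars, while var_name stays a single string (in the question it was given the list of hours). The sample data below is made up and only has three hour columns; hour_cols and long_df are illustrative names. Note that melt groups its output by hour column, so this sketch keeps the original row index (ignore_index=False, which assumes pandas 1.1 or later) and sorts on it to get the hours grouped under each date.

```python
import pandas as pd

# Hypothetical sample data: two dates, three hour columns (the real df has 24).
df = pd.DataFrame(
    {
        "DATE": ["1/1/2020", "1/2/2020"],
        "00:00": [0.0, 0.1],
        "01:00": [0.2, 0.0],
        "02:00": [0.0, 0.3],
    }
)

# Every column except DATE is treated as an hour column.
hour_cols = [c for c in df.columns if c != "DATE"]

long_df = df.melt(
    id_vars=["DATE"],          # column(s) kept as identifiers
    value_vars=hour_cols,      # columns to unpivot into rows
    var_name="hour",           # a single name for the new label column
    value_name="Hourly Precip",
    ignore_index=False,        # keep the original row index (pandas >= 1.1)
)

# A stable sort on the original row index groups the rows by date
# while keeping the hours in their original column order.
long_df = long_df.sort_index(kind="mergesort").reset_index(drop=True)

print(long_df)
```

With the full data this should give one row per date and hour, in the same order as the set_index/stack approach.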

Upvotes: 0
