Reputation: 123
I have one year of 5 minute data like this:
df = pd.DataFrame([['1/1/2019 00:05', 1], ['1/1/2019 00:10', 5],['1/1/2019 00:15', 1], ['1/1/2019 00:20',3], ['1/1/2019 00:25', 1],
['1/1/2019 00:30', 2], ['1/1/2019 00:35', 6],['1/1/2019 00:40', 8],['1/1/2019 00:45', 1], ['1/1/2019 00:55', 2],
['1/1/2019 01:00', 8],['1/1/2019 01:05', 1], ['1/1/2019 01:10', 5],['1/1/2019 01:15', 1], ['1/1/2019 01:20',3],['1/1/2019 01:25', 1],
['1/1/2019 01:30', 2], ['1/1/2019 01:35', 6],['1/1/2019 01:40', 8],['1/1/2019 01:45', 1], ['1/1/2019 01:55', 2],
['1/1/2019 02:00', 8]],
columns = ['Date','Value'])
And I'm looking to transpose it hourly for all the corresponding period. Now each row corresponds to one hour for a specific day and specific month. Something like this:
df = pd.DataFrame([['day1hour0month1', 1, 1, 3, 4, 1, 0, 1, 5, 2, 1, 3,3], ['day1hour1month1', 1, 1, 3, 4, 1, 0, 1, 5, 2, 1, 3,3],
['day1hour2month1', 1, 1, 3, 4, 1, 0, 1, 5, 2, 1, 3,3], ['day1hour3month1', 1, 1, 3, 4, 1, 0, 1, 5, 2, 1, 3,3],
['day1hour4month1', 1, 1, 3, 4, 1, 0, 1, 5, 2, 1, 3,3], ['day1hour5month1', 1, 1, 3, 4, 1, 0, 1, 5, 2, 1, 3,3],
['day1hour6month1', 1, 1, 3, 4, 1, 0, 1, 5, 2, 1, 3,3], ['day1hour7month1', 1, 1, 3, 4, 1, 0, 1, 5, 2, 1, 3,3],
['day1hour8month1', 1, 1, 3, 4, 1, 0, 1, 5, 2, 1, 3,3], ['day1hour9month1', 1, 1, 3, 4, 1, 0, 1, 5, 2, 1, 3,3],
['day31hour23month12', 1, 1, 8, 0, 6, 5, 3, 1, 1, 2,3,5]],
columns = ['Date', 'min05', 'min10', 'min15', 'min20', 'min25',
'min30', 'min35', 'min40', 'min45', 'min50',
'min55', 'min60'])
Is there any way to do it using Pandas time-series functionality (without using a for loop)? I would really appreciate any suggestion for implementing this operation.
Thank you in advance!
Cheers.
Upvotes: 0
Views: 55
Reputation: 34046
Based on your sample dataframe:
In [2213]: df['Date'] = pd.to_datetime(df['Date'])
In [2191]: df1['dmh'] = 'day' + df.Date.dt.day.astype(str) + 'hour' + df.Date.dt.hour.astype(str) + 'month' + df.Date.dt.month.astype(str)
In [2199]: df['minute'] = 'min' + df.Date.dt.minute.astype(str)
In [2211]: df.pivot(index='dmh', columns='minute', values='Value')
Out[2211]:
minute min0 min10 min15 min20 min25 min30 min35 min40 min45 min5 min55
dmh
day1hour0month1 NaN 5.0 1.0 3.0 1.0 2.0 6.0 8.0 1.0 1.0 2.0
day1hour1month1 8.0 5.0 1.0 3.0 1.0 2.0 6.0 8.0 1.0 1.0 2.0
day1hour2month1 8.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
Upvotes: 1