Reputation: 271
I have a dataframe as shown below:
Events | Start DateTime | End DateTime | Max(in/hr) | Total Rainfall(in)
------------------------------------------------------------------------------
Event 1 | 2016-08-15 3:10 | 2016-08-15 19:08 | 3.67500 | 2.294
----------------------------------------------------------------------------
Event 2 | 2016-08-16 6:50 | 2016-08-17 12:57 | 1.41600 | 2.1680
-----------------------------------------------------------------------------
| |
------------------------------------------------------------------------------
Event 47 | 2016-08-01 13:30 | 2016-08-01 17:03 | 0.1371 | 0.1023
------------------------------------------------------------------------------
I would like to rearrange the data frame such that, Start datetime is increased by 5 minutes time interval upto end date time and other columns are sorted accordingly. Desired output is as shown below:
Time | Max (in/h) | Total Rainfall(in) | Events
------------------------------------------------------------------------------
2016-08-15 3:10 | 3.67500 | 2.294 | Event 1
----------------------------------------------------------------------------
2016-08-15 3:15 | 3.67500 | 2.294 | Event 1
-----------------------------------------------------------------------------
2016-08-15 3:20 | 3.67500 | 2.294 | Event 1
------------------------------------------------------------------------------
||
------------------------------------------------------------------------------
2016-08-15 19:08 | 3.67500 | 2.294 | Event 1 (Upto End datetime)
-----------------------------------------------------------------------------
||
------------------------------------------------------------------------------
2016-08-01 17:03 | 0.1371 | 0.1023 | Event 47 (Similarly for all events)
------------------------------------------------------------------------------
Upvotes: 1
Views: 49
Reputation: 862611
First reshape by melt
and then groupby
and resample
each 5 Min
with ffill
:
df['Start DateTime'] = pd.to_datetime(df['Start DateTime'])
df['End DateTime'] = pd.to_datetime(df['End DateTime'])
df1 = pd.melt(
df, ['Max(in/hr)', 'Total Rainfall(in)', 'Events'],
['Start DateTime', 'End DateTime'],
value_name='Time'
).drop('variable', 1).set_index('Time')
print (df1)
Max(in/hr) Total Rainfall(in) Events
Time
2016-08-15 03:10:00 3.6750 2.2940 Event 1
2016-08-16 06:50:00 1.4160 2.1680 Event 2
2016-08-01 13:30:00 0.1371 0.1023 Event 47
2016-08-15 19:08:00 3.6750 2.2940 Event 1
2016-08-17 12:57:00 1.4160 2.1680 Event 2
2016-08-01 17:03:00 0.1371 0.1023 Event 47
df2 = df1.groupby('Events')
.resample('5T')
.ffill()
.reset_index(level=0, drop=True)
.reset_index()
print (df2)
Time Max(in/hr) Total Rainfall(in) Events
0 2016-08-15 03:10:00 3.6750 2.2940 Event 1
1 2016-08-15 03:15:00 3.6750 2.2940 Event 1
2 2016-08-15 03:20:00 3.6750 2.2940 Event 1
3 2016-08-15 03:25:00 3.6750 2.2940 Event 1
4 2016-08-15 03:30:00 3.6750 2.2940 Event 1
5 2016-08-15 03:35:00 3.6750 2.2940 Event 1
6 2016-08-15 03:40:00 3.6750 2.2940 Event 1
7 2016-08-15 03:45:00 3.6750 2.2940 Event 1
8 2016-08-15 03:50:00 3.6750 2.2940 Event 1
9 2016-08-15 03:55:00 3.6750 2.2940 Event 1
...
...
Upvotes: 1