nish
nish

Reputation: 271

How to rearrage the dataframe by combining 2 columns and sorting the corresponding rows to columns in pandas?

I have a dataframe as shown below:

Events   |  Start DateTime  |   End DateTime   | Max(in/hr) | Total Rainfall(in)
------------------------------------------------------------------------------
Event 1  | 2016-08-15 3:10  | 2016-08-15 19:08 | 3.67500     |     2.294
----------------------------------------------------------------------------
Event 2  | 2016-08-16 6:50  | 2016-08-17 12:57 | 1.41600     |     2.1680
-----------------------------------------------------------------------------
     |          |  
------------------------------------------------------------------------------
Event 47 | 2016-08-01 13:30 | 2016-08-01 17:03 | 0.1371      |     0.1023
------------------------------------------------------------------------------

I would like to rearrange the data frame such that, Start datetime is increased by 5 minutes time interval upto end date time and other columns are sorted accordingly. Desired output is as shown below:

Time  |  Max (in/h) |   Total Rainfall(in)  | Events 
------------------------------------------------------------------------------
2016-08-15 3:10  |  3.67500 |     2.294     | Event 1
----------------------------------------------------------------------------
2016-08-15 3:15  |  3.67500 |     2.294     | Event 1     
-----------------------------------------------------------------------------
2016-08-15 3:20  |  3.67500 |     2.294     | Event 1       
------------------------------------------------------------------------------
||
------------------------------------------------------------------------------
2016-08-15 19:08 |  3.67500 |     2.294     | Event 1   (Upto End datetime)
-----------------------------------------------------------------------------
||
------------------------------------------------------------------------------
2016-08-01 17:03 |  0.1371  |     0.1023    | Event 47  (Similarly for all events)
------------------------------------------------------------------------------

Upvotes: 1

Views: 49

Answers (1)

jezrael
jezrael

Reputation: 862611

First reshape by melt and then groupby and resample each 5 Min with ffill:

df['Start DateTime'] = pd.to_datetime(df['Start DateTime'])
df['End DateTime'] = pd.to_datetime(df['End DateTime'])

df1 = pd.melt(
    df, ['Max(in/hr)', 'Total Rainfall(in)', 'Events'],
    ['Start DateTime', 'End DateTime'],
    value_name='Time'
).drop('variable', 1).set_index('Time')
print (df1)
                     Max(in/hr)  Total Rainfall(in)    Events
Time                                                         
2016-08-15 03:10:00      3.6750              2.2940   Event 1
2016-08-16 06:50:00      1.4160              2.1680   Event 2
2016-08-01 13:30:00      0.1371              0.1023  Event 47
2016-08-15 19:08:00      3.6750              2.2940   Event 1
2016-08-17 12:57:00      1.4160              2.1680   Event 2
2016-08-01 17:03:00      0.1371              0.1023  Event 47
df2 = df1.groupby('Events')
         .resample('5T')
         .ffill()
         .reset_index(level=0, drop=True)
         .reset_index()
print (df2)
                   Time  Max(in/hr)  Total Rainfall(in)    Events
0   2016-08-15 03:10:00      3.6750              2.2940   Event 1
1   2016-08-15 03:15:00      3.6750              2.2940   Event 1
2   2016-08-15 03:20:00      3.6750              2.2940   Event 1
3   2016-08-15 03:25:00      3.6750              2.2940   Event 1
4   2016-08-15 03:30:00      3.6750              2.2940   Event 1
5   2016-08-15 03:35:00      3.6750              2.2940   Event 1
6   2016-08-15 03:40:00      3.6750              2.2940   Event 1
7   2016-08-15 03:45:00      3.6750              2.2940   Event 1
8   2016-08-15 03:50:00      3.6750              2.2940   Event 1
9   2016-08-15 03:55:00      3.6750              2.2940   Event 1
...
...

Upvotes: 1

Related Questions