Reputation: 135
I have a pandas dataframe which contains data and the datetime range over which it applies:
data | start_time | end_time
0 | 2020-01-01-00 | 2020-01-01-02
1 | 2020-01-01-05 | 2020-01-01-08
I'd like to translate the data to an hourly time series, so it ends up as follows:
data | time
0 | 2020-01-01-00
0 | 2020-01-01-01
0 | 2020-01-01-02
1 | 2020-01-01-05
1 | 2020-01-01-06
1 | 2020-01-01-07
1 | 2020-01-01-08
I could brute force my way through this, but was wondering if there is built in functionality in pandas dataframes I could explore to help achieve this.
Upvotes: 5
Views: 145
Reputation: 672
You might try to create a temporal time serie dataframe and merge with yours, then select the relevant. As a suggestion:
df = pd.DataFrame({'data': [0, 1],'start_time': ['2020-01-01-00', '2020-01-01-05'],\
'end_time': ['2020-01-01-02', '2020-01-01-08']})
# Create a temporal time serie dataframe
df_ts = pd.DataFrame({'start_time': pd.date_range(df.start_time.min(), df.end_time.max(), freq='h')\
.strftime('%Y-%m-%d-%H')})
# Merge and pad the value between start & end time
new_df = df_ts.merge(df, how='left').fillna(method='pad')
# Select the relevant rows
new_df = new_df.loc[new_df.start_time <= new_df.end_time, ['data','start_time']]\
.rename(columns={'start_time': 'time'})
new_df
data time
0.0 2020-01-01-00
0.0 2020-01-01-01
0.0 2020-01-01-02
1.0 2020-01-01-05
1.0 2020-01-01-06
1.0 2020-01-01-07
1.0 2020-01-01-08
Upvotes: 0
Reputation: 29742
Using pandas.date_range
and pandas.DataFrame.explode
:
df['time'] = df.apply(lambda x: pd.date_range(x['start_time'], x['end_time'], freq='1H'), 1)
print(df.explode('time'))
Output:
data start_time end_time time
0 0 2020-01-01-00 2020-01-01-02 2020-01-01 00:00:00
0 0 2020-01-01-00 2020-01-01-02 2020-01-01 01:00:00
0 0 2020-01-01-00 2020-01-01-02 2020-01-01 02:00:00
1 1 2020-01-01-05 2020-01-01-08 2020-01-01 05:00:00
1 1 2020-01-01-05 2020-01-01-08 2020-01-01 06:00:00
1 1 2020-01-01-05 2020-01-01-08 2020-01-01 07:00:00
1 1 2020-01-01-05 2020-01-01-08 2020-01-01 08:00:00
Upvotes: 4