Reputation: 25997
I have a dataframe like this:
import pandas as pd

index = ['2018-02-17 00:30:00', '2018-02-17 07:00:00',
         '2018-02-17 13:00:00', '2018-02-17 19:00:00',
         '2018-02-18 00:00:00', '2018-02-18 07:00:00',
         '2018-02-18 10:30:00', '2018-02-18 13:00:00']
df = pd.DataFrame({'col': list(range(len(index)))})
df.index = pd.to_datetime(index)
                     col
2018-02-17 00:30:00    0
2018-02-17 07:00:00    1
2018-02-17 13:00:00    2
2018-02-17 19:00:00    3
2018-02-18 00:00:00    4
2018-02-18 07:00:00    5
2018-02-18 10:30:00    6
2018-02-18 13:00:00    7
and would like to add a column holding the time elapsed since the first timestamp, in hours, so my desired outcome looks like this:
                     col  time_range
2018-02-17 00:30:00    0         0.0
2018-02-17 07:00:00    1         6.5
2018-02-17 13:00:00    2        12.5
2018-02-17 19:00:00    3        18.5
2018-02-18 00:00:00    4        23.5
2018-02-18 07:00:00    5        30.5
2018-02-18 10:30:00    6        34.0
2018-02-18 13:00:00    7        36.5
I currently do this as follows:
df['time_range'] = [(ti - df.index[0]).delta / (10 ** 9 * 60 * 60) for ti in df.index]
Is there a smarter (i.e. vectorized/built-in) way of doing this?
Upvotes: 1
Views: 89
Reputation: 862661
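Subtract the first timestamp from the index to get a TimedeltaIndex, then convert it to hours:
df['new'] = (df.index - df.index[0]).total_seconds() / 3600
Or divide by a one-hour timedelta (this needs import numpy as np):
df['new'] = (df.index - df.index[0]) / np.timedelta64(1, 'h')
print(df)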
                     col   new
2018-02-17 00:30:00    0   0.0
2018-02-17 07:00:00    1   6.5
2018-02-17 13:00:00    2  12.5
2018-02-17 19:00:00    3  18.5
2018-02-18 00:00:00    4  23.5
2018-02-18 07:00:00    5  30.5
2018-02-18 10:30:00    6  34.0
2018-02-18 13:00:00    7  36.5
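For reference, here is a self-contained sketch that puts both variants side by side on the data from the question. The variable names elapsed, hours_a, hours_b and hours_c are only illustrative, and the third variant (dividing by pd.Timedelta(hours=1)) is an extra I am adding as an assumed equivalent, not something the question asked for.

```python
import numpy as np
import pandas as pd

index = pd.to_datetime([
    '2018-02-17 00:30:00', '2018-02-17 07:00:00',
    '2018-02-17 13:00:00', '2018-02-17 19:00:00',
    '2018-02-18 00:00:00', '2018-02-18 07:00:00',
    '2018-02-18 10:30:00', '2018-02-18 13:00:00',
])
df = pd.DataFrame({'col': range(len(index))}, index=index)

# Elapsed time since the first row, as a TimedeltaIndex (vectorized).
elapsed = df.index - df.index[0]

# Convert the timedeltas to fractional hours in three ways.
hours_a = elapsed.total_seconds() / 3600       # seconds -> hours
hours_b = elapsed / np.timedelta64(1, 'h')     # divide by a NumPy one-hour unit
hours_c = elapsed / pd.Timedelta(hours=1)      # divide by a pandas one-hour unit

df['time_range'] = hours_a
print(df)
```

All three should give the same float values, so the choice is mostly a matter of taste; total_seconds() avoids the NumPy import.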
Upvotes: 2