Reputation: 197
I have two date-time columns in my pandas data frame; How can I find the difference in hours (numeric)?
For example the duration from 2018-07-30 19:03:04 to 2018-07-31 11:00:48 is 15.962 hours. When I subtract two columns, I get 15:57:43.430000, which is not desired.
Upvotes: 1
Views: 2367
Reputation: 13255
Take seconds from timedelta
then divide it with 60*60
.
Dividing because 60 Seconds -> 1 Minute
and 60 Minutes -> 1 Hour
so the output will be in the number of hours.
Code:
t1 = pd.to_datetime('2018-07-30 19:03:04')
t2 = pd.to_datetime('2018-07-31 11:00:48')
(t2-t1).seconds/(60*60)
15.962222222222222
When you subtract them you will get default view of timedelta representation where first it shows days then hours followed by minutes,seconds and microseconds. That is the you are getting 15:57:44
value.
t2-t1
Timedelta('0 days 15:57:44')
The value of 15:57:44
is rounded because the t1 and t2 does not have microseconds in the question.
To use this concept on series
you need to use apply
method like below.
The DateTime
columns are generated and shuffled.
Code:
dates = pd.date_range('2018-08-23 02:34:54','2018-08-24 00:00:00',periods=22).values
np.random.shuffle(dates)
df = pd.DataFrame(dates.reshape(11,2),columns=['t1','t2'])
df
t1 t2
0 2018-08-23 21:57:36.571428608 2018-08-23 08:42:04.285714176
1 2018-08-23 07:40:52.571428608 2018-08-23 22:58:48.285714176
2 2018-08-23 16:51:38.000000000 2018-08-23 11:45:39.428571392
3 2018-08-24 00:00:00.000000000 2018-08-23 04:37:17.428571392
4 2018-08-23 10:44:27.714285824 2018-08-23 12:46:51.142857216
5 2018-08-23 06:39:40.857142784 2018-08-23 13:48:02.857142784
6 2018-08-23 15:50:26.285714176 2018-08-23 03:36:05.714285824
7 2018-08-23 02:34:54.000000000 2018-08-23 09:43:16.000000000
8 2018-08-23 05:38:29.142857216 2018-08-23 18:54:01.428571392
9 2018-08-23 14:49:14.571428608 2018-08-23 19:55:13.142857216
10 2018-08-23 20:56:24.857142784 2018-08-23 17:52:49.714285824
(df['t1']-df['t2']).apply(lambda x: x.seconds/(60*60))
0 13.258889
1 8.701111
2 5.099444
3 19.378333
4 21.960000
5 16.860556
6 12.238889
7 16.860556
8 10.740833
9 18.900278
10 3.059722
dtype: float64
Update: Faster approach than above:
(df['t1']-df['t2'])/np.timedelta64(1,'h')
Upvotes: 4