P.J
P.J

Reputation: 197

How to get the time duration from two date-time columns of pandas dataframe?

I have two date-time columns in my pandas data frame; How can I find the difference in hours (numeric)?

Upvotes: 1

Views: 2367

Answers (1)

Space Impact
Space Impact

Reputation: 13255

Take seconds from timedelta then divide it with 60*60. Dividing because 60 Seconds -> 1 Minute and 60 Minutes -> 1 Hour so the output will be in the number of hours.

Code:

t1 = pd.to_datetime('2018-07-30 19:03:04')
t2 = pd.to_datetime('2018-07-31 11:00:48')
(t2-t1).seconds/(60*60)
15.962222222222222

When you subtract them you will get default view of timedelta representation where first it shows days then hours followed by minutes,seconds and microseconds. That is the you are getting 15:57:44 value.

t2-t1
Timedelta('0 days 15:57:44')

The value of 15:57:44 is rounded because the t1 and t2 does not have microseconds in the question.

To use this concept on series you need to use apply method like below. The DateTime columns are generated and shuffled.

Code:

dates = pd.date_range('2018-08-23 02:34:54','2018-08-24 00:00:00',periods=22).values
np.random.shuffle(dates)
df = pd.DataFrame(dates.reshape(11,2),columns=['t1','t2'])
df
                               t1                              t2
0   2018-08-23 21:57:36.571428608   2018-08-23 08:42:04.285714176
1   2018-08-23 07:40:52.571428608   2018-08-23 22:58:48.285714176
2   2018-08-23 16:51:38.000000000   2018-08-23 11:45:39.428571392
3   2018-08-24 00:00:00.000000000   2018-08-23 04:37:17.428571392
4   2018-08-23 10:44:27.714285824   2018-08-23 12:46:51.142857216
5   2018-08-23 06:39:40.857142784   2018-08-23 13:48:02.857142784
6   2018-08-23 15:50:26.285714176   2018-08-23 03:36:05.714285824
7   2018-08-23 02:34:54.000000000   2018-08-23 09:43:16.000000000
8   2018-08-23 05:38:29.142857216   2018-08-23 18:54:01.428571392
9   2018-08-23 14:49:14.571428608   2018-08-23 19:55:13.142857216
10  2018-08-23 20:56:24.857142784   2018-08-23 17:52:49.714285824

(df['t1']-df['t2']).apply(lambda x: x.seconds/(60*60))
0     13.258889
1      8.701111
2      5.099444
3     19.378333
4     21.960000
5     16.860556
6     12.238889
7     16.860556
8     10.740833
9     18.900278
10     3.059722
dtype: float64

Update: Faster approach than above:

(df['t1']-df['t2'])/np.timedelta64(1,'h')

Upvotes: 4

Related Questions