MaximeMusterFrau

Reputation: 59

How to sum dtype: timedelta64[ns] in pandas/Python?

I am using pandas in Python to work with times. I would like to sum up the time elapsed between a couple of pairs of dates, which are:

0   2012-03-06 14:22:00
0   2012-06-02 11:29:00


1   2012-04-16 20:51:00
1   2012-04-28 09:57:00

To do this, I calculate the time elapsed between the first two dates, indexed with 0, like this:

dt0 = df.end[0] - df.start[0]  
out: 87 days 21:07:00
dtype: timedelta64[ns]

and the same between the next two dates, like this:

dt1 = df.end[1] - df.start[1]  
out: 11 days 13:06:00
dtype: timedelta64[ns]

This works fine, but when I sum the two times:

dt2 = dt1 + dt0 

I get dt2 = NaT instead of the sum of 87 days 21:07:00 + 11 days 13:06:00. Can anyone help?

Below is a screenshot of another example of the same problem: adding up a and b, both of dtype timedelta64[ns], does not work. Why?

[Screenshot not available]

Upvotes: 2

Views: 1200

Answers (1)

ALollz

Reputation: 59579

See, this is why I explicitly wanted you to print the types. dt1 and dt0 are NOT <class 'pandas._libs.tslibs.timedeltas.Timedelta'>, they are pandas.Series.

When you add two Series, pandas aligns them on the index. Since dt0 and dt1 do not share any index labels, it fills the missing values with a null value (NaT in this case) and then performs the addition. By default, the addition does not ignore null values, so what you are seeing is x + NaT = NaT, which is how the math works.

Sample Data

import pandas as pd

a = pd.Series(pd.Timedelta(1,'d'), index=[21005])
#21005   1 days
#dtype: timedelta64[ns]

b = pd.Series(pd.Timedelta(2,'d'), index=[16992])
#16992   2 days
#dtype: timedelta64[ns]

Code

Addition will align on indices. They share no indices so you get NaT.

a+b
#16992   NaT
#21005   NaT
#dtype: timedelta64[ns]

What you really want to do is add the values, regardless of index:

a.values+b.values
#array([259200000000000], dtype='timedelta64[ns]')

But really you should change your code so that dt0 and dt1 are just the scalar values, if you have no actual need for the pd.Series.
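
As a rough sketch of what that could look like: the dt0 and dt1 below are hypothetical one-element Series built to mimic the question (the values come from the question, the index labels are borrowed from the sample data above), since the original DataFrame is not shown. Pulling the scalar out by position with .iloc[0], or concatenating the Series and calling .sum(), both sidestep index alignment.

```python
import pandas as pd

# Hypothetical stand-ins for the question's dt0 and dt1:
# one-element Series whose index labels do not match.
dt0 = pd.Series(pd.Timedelta("87 days 21:07:00"), index=[21005])
dt1 = pd.Series(pd.Timedelta("11 days 13:06:00"), index=[16992])

# Option 1: extract the scalar Timedelta by position, then add.
total = dt0.iloc[0] + dt1.iloc[0]

# Option 2: stack the Series and sum; no index alignment is involved.
total_concat = pd.concat([dt0, dt1]).sum()

print(type(dt0))   # a Series, not a Timedelta
print(total)       # 99 days 10:13:00
print(total_concat == total)
```

Either way the result is a single pd.Timedelta rather than a Series, so later additions behave like ordinary arithmetic.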

Upvotes: 2
