Reputation: 2418
According to the pandas 0.13.1 manual, you can reduce a numpy timedelta64 series:
http://pandas.pydata.org/pandas-docs/stable/timeseries.html#time-deltas-reductions
This seems to work fine with, for example, mean():
In [107]:
pd.Series(np.random.randint(0,100000,100).astype("timedelta64[ns]")).mean()
Out[107]:
0 00:00:00.000047
dtype: timedelta64[ns]
However, sum() always returns a plain integer instead:
In [108]:
pd.Series(np.random.randint(0,100000,100).astype("timedelta64[ns]")).sum()
Out[108]:
5047226
Is this a bug, or is something such as an overflow causing this? Is it safe to cast the result into timedelta64? How would I work around this?
I am using numpy 1.8.0.
Upvotes: 0
Views: 224
Reputation: 129068
This looks like a bug; I just filed it here: https://github.com/pydata/pandas/issues/6462
The integer that sum() returns is the total in nanoseconds, so as a workaround you can pass it back through pd.to_timedelta:
In [1]: s = pd.to_timedelta(range(4),unit='d')
In [2]: s
Out[2]:
0 0 days
1 1 days
2 2 days
3 3 days
dtype: timedelta64[ns]
In [3]: s.mean()
Out[3]:
0 1 days, 12:00:00
dtype: timedelta64[ns]
In [4]: s.sum()
Out[4]: 518400000000000
In [8]: pd.to_timedelta([s.sum()])
Out[8]:
0 6 days
dtype: timedelta64[ns]
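The same workaround can be written as a standalone script. This is only a minimal sketch: the variable names (values, total_ns, total) are made up for illustration, and it sums the underlying int64 nanosecond counts explicitly so that it behaves the same whether or not your pandas version's sum() is affected by the issue above.

```python
import numpy as np
import pandas as pd

# Four timedeltas (0, 1, 2 and 3 days), stored with nanosecond resolution.
values = np.arange(4, dtype="int64").astype("timedelta64[D]").astype("timedelta64[ns]")
s = pd.Series(values)

# Sum the raw int64 nanosecond counts. This is the kind of integer
# that the affected sum() returned.
total_ns = int(s.values.astype("timedelta64[ns]").astype("int64").sum())

# Reinterpret the integer as a duration, once as a pandas scalar
# and once as a NumPy scalar.
total = pd.Timedelta(total_ns, unit="ns")
total_np = np.timedelta64(total_ns, "ns")

print(total_ns)
print(total)
```

On the overflow question: an int64 count of nanoseconds covers roughly 292 years in either direction, so a sum of 100 values below 100000 ns each is nowhere near the limit, and interpreting the integer as nanoseconds should be safe in that case.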
Upvotes: 1