Joost Döbken

Reputation: 4007

NumPy array of integers to timedelta

I have a NumPy array of integer milliseconds that I want to convert to an array of Python datetimes by adding them, as timedeltas, to a start time.

The following MWE works, but I suspect there is a more elegant or faster approach than multiplying by a 1 ms timedelta.

from datetime import timedelta
import numpy as np
import pandas as pd
start = pd.Timestamp('2016-01-02 03:04:56.789101').to_pydatetime()
dt = np.array([19, 14980, 19620, 54964615, 54964655, 86433958])
time_arr = start + dt * timedelta(milliseconds=1)

Upvotes: 5

Views: 11950

Answers (1)

hpaulj

Reputation: 231335

So your approach produces:

In [56]: start = pd.Timestamp('2016-01-02 03:04:56.789101').to_pydatetime()
In [57]: start
Out[57]: datetime.datetime(2016, 1, 2, 3, 4, 56, 789101)
In [58]: dt = np.array([      19,    14980,    19620, 54964615, 54964655, 86433958])
In [59]: time_arr = start +  dt * timedelta(milliseconds=1)
In [60]: time_arr
Out[60]: 
array([datetime.datetime(2016, 1, 2, 3, 4, 56, 808101),
       datetime.datetime(2016, 1, 2, 3, 5, 11, 769101),
       datetime.datetime(2016, 1, 2, 3, 5, 16, 409101),
       datetime.datetime(2016, 1, 2, 18, 21, 1, 404101),
       datetime.datetime(2016, 1, 2, 18, 21, 1, 444101),
       datetime.datetime(2016, 1, 3, 3, 5, 30, 747101)], dtype=object)

The equivalent using np.datetime64 types:

In [61]: dt.astype('timedelta64[ms]')
Out[61]: array([      19,    14980,    19620, 54964615, 54964655, 86433958], dtype='timedelta64[ms]')
In [62]: np.datetime64(start)
Out[62]: numpy.datetime64('2016-01-02T03:04:56.789101')
In [63]: np.datetime64(start) + dt.astype('timedelta64[ms]')
Out[63]: 
array(['2016-01-02T03:04:56.808101', '2016-01-02T03:05:11.769101',
       '2016-01-02T03:05:16.409101', '2016-01-02T18:21:01.404101',
       '2016-01-02T18:21:01.444101', '2016-01-03T03:05:30.747101'], dtype='datetime64[us]')

I can produce the same array from your time_arr with np.array(time_arr, dtype='datetime64[us]').

tolist converts these datetime64 items to datetime objects:

In [97]: t1=np.datetime64(start) + dt.astype('timedelta64[ms]')
In [98]: t1.tolist()
Out[98]: 
[datetime.datetime(2016, 1, 2, 3, 4, 56, 808101),
 datetime.datetime(2016, 1, 2, 3, 5, 11, 769101),
 datetime.datetime(2016, 1, 2, 3, 5, 16, 409101),
 datetime.datetime(2016, 1, 2, 18, 21, 1, 404101),
 datetime.datetime(2016, 1, 2, 18, 21, 1, 444101),
 datetime.datetime(2016, 1, 3, 3, 5, 30, 747101)]

or wrap it back in an array to get your time_arr:

In [99]: np.array(t1.tolist())
Out[99]: 
array([datetime.datetime(2016, 1, 2, 3, 4, 56, 808101),
       ...
       datetime.datetime(2016, 1, 3, 3, 5, 30, 747101)], dtype=object)
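
The two conversions described above can be checked against each other. The sketch below is only an illustration: `baseline` is a hypothetical pure-Python reference built with a list comprehension, and `as_objects` and `as_dt64` are names introduced here.

```python
from datetime import datetime, timedelta
import numpy as np

start = datetime(2016, 1, 2, 3, 4, 56, 789101)
dt = np.array([19, 14980, 19620, 54964615, 54964655, 86433958])

# Pure-Python reference: one datetime per millisecond offset.
baseline = [start + timedelta(milliseconds=int(ms)) for ms in dt]

# datetime64 route, then back to an object array of datetime objects.
t1 = np.datetime64(start) + dt.astype('timedelta64[ms]')
as_objects = np.array(t1.tolist())  # dtype=object

# Opposite direction: object array of datetimes -> datetime64[us].
as_dt64 = np.array(as_objects, dtype='datetime64[us]')

print(as_objects.tolist() == baseline)
print(np.array_equal(as_dt64, t1))
```

One caveat: `tolist` yields `datetime` objects at microsecond or coarser resolution. For a `datetime64[ns]` array it returns plain integers instead, so cast to `datetime64[us]` first if your data is in nanoseconds.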

For the arithmetic alone, datetime64 is faster, but once you include the conversions to and from datetime objects, it may not be the fastest approach overall.
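
If you want to measure this on your own data, a rough timing sketch along these lines may help. The array size, the repeat count, and the three function names are arbitrary choices made for illustration, and the numbers will vary with machine and NumPy version.

```python
import timeit
from datetime import datetime, timedelta
import numpy as np

start = datetime(2016, 1, 2, 3, 4, 56, 789101)
dt = np.arange(10_000) * 37  # hypothetical millisecond offsets

def via_objects():
    # The question's approach: object-array arithmetic on datetime/timedelta.
    return start + dt * timedelta(milliseconds=1)

def via_datetime64():
    # Arithmetic only; the result stays a datetime64[us] array.
    return np.datetime64(start) + dt.astype('timedelta64[ms]')

def via_datetime64_roundtrip():
    # Same arithmetic, then conversion back to an object array of datetimes.
    return np.array(via_datetime64().tolist())

for fn in (via_objects, via_datetime64, via_datetime64_roundtrip):
    seconds = timeit.timeit(fn, number=20)
    print(f"{fn.__name__}: {seconds:.4f} s for 20 runs")
```

Comparing `via_datetime64` with `via_datetime64_roundtrip` separates the cost of the arithmetic from the cost of converting back to Python objects.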

https://docs.scipy.org/doc/numpy/reference/arrays.datetime.html

Upvotes: 8
