user7468395

Reputation: 1349

How to convert int array back to pandas timestamp?

I am able to convert a column of pandas Timestamp objects in a NumPy array to an int array:

import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [pd.Timestamp(2019, 1, 11, 5, 30, 1), pd.Timestamp(2019, 1, 11, 5, 30, 1), pd.Timestamp(2019, 1, 11, 5, 30, 1)], 'b': [np.nan, 5.1, 1.6]})

a = df.to_numpy()
a
# array([[Timestamp('2019-01-11 05:30:01'), nan],
#       [Timestamp('2019-01-11 05:30:01'), 5.1],
#       [Timestamp('2019-01-11 05:30:01'), 1.6]], dtype=object)
a[:,0] = a[:,0].astype('datetime64').astype(np.int64)
# array([[1547184601000000, nan],
#        [1547184601000000, 5.1],
#        [1547184601000000, 1.6]], dtype=object)

For this array a, I would like to convert column 0 back to pandas timestamps. Because the array is quite large and my overall process is time-consuming, I would like to avoid Python loops, apply, lambdas, and the like. Instead, I am looking for speed-optimized, native NumPy-based functions.

I have already tried things like:

a[:,0].astype('datetime64')

(result: ValueError: Converting an integer to a NumPy datetime requires a specified unit)

and:

import calendar
calendar.timegm(a[:,0].utctimetuple())

(result: AttributeError: 'numpy.ndarray' object has no attribute 'utctimetuple')

How can I convert my column a[:,0] back to

array([[Timestamp('2019-01-11 05:30:01'), nan],
      [Timestamp('2019-01-11 05:30:01'), 5.1],
      [Timestamp('2019-01-11 05:30:01'), 1.6]], dtype=object)

in a speed-optimized way?

Upvotes: 0

Views: 1005

Answers (1)

Frank AK

Reputation: 1781

Let's review the docs for DatetimeIndex:

Immutable ndarray of datetime64 data, represented internally as int64, and which can be boxed to Timestamp objects that are subclasses of datetime and carry metadata such as frequency information.

So we can use DatetimeIndex and then convert it to integers using np.int64.

In [18]: b = a[:,0]                                                             

In [19]: index = pd.DatetimeIndex(b)

In [21]: index.astype(np.int64)                                                 
Out[21]: Int64Index([1547184601000000000, 1547184601000000000, 1547184601000000000], dtype='int64')
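
For the reverse direction the question asks about, a minimal sketch follows. It assumes the integers in column 0 are microseconds since the epoch, as the question's output suggests (note that the nanosecond values shown above are 1000 times larger), so the unit is passed explicitly. The names micros, index and restored are only illustrative. The index is boxed with astype(object) before being written back, because assigning a datetime64[ns] array directly into an object array may store plain integers rather than Timestamp objects.

```python
import numpy as np
import pandas as pd

# Rebuild the question's array: column 0 holds microseconds since the epoch.
a = np.array([[1547184601000000, np.nan],
              [1547184601000000, 5.1],
              [1547184601000000, 1.6]], dtype=object)

# Cast column 0 to a real int64 array, then give pandas the unit explicitly.
micros = a[:, 0].astype(np.int64)
index = pd.to_datetime(micros, unit='us')

# Box the values as Timestamp objects before writing them back.
restored = a.copy()
restored[:, 0] = index.astype(object).to_numpy()
```

All steps are vectorized, so no Python-level loop, apply or lambda is involved. If only the datetime column is needed, working with index directly and skipping the object array is likely to be faster.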

Upvotes: 1
