rookieOC
rookieOC

Reputation: 47

numpy array item from type object cannot be converted to numpy.datetime64

I am quite new to python and trying some features. I searched a lot in here for a possible solution with no success. This is my issue:

Goal:

The array output looks like this:

['2020-11-09 20:30:59' '2020-11-08 09:22:54' '2020-11-04 02:24:17' ...
 '1900-01-27 02:30:00' '1900-01-24 03:00:00' '1900-01-18 15:30:00']

I tried the following:

x[:, 0] = [np.datetime64(x[:, 0])]

--> this doesn't work; so I tried to convert to String before.

x[:, 0] = [np.datetime64(str(x[:, 0]))] 

--> This does also not work with this error message:

ValueError: Error parsing datetime string "['2020-11-09 20:30:59' '2020-11-08 09:22:54' '2020-11-04 02:24:17' ...
 '1900-01-27 02:30:00' '1900-01-24 03:00:00' '1900-01-18 15:30:00']" at position 0

So where is the issue?

Thanks a lot for the answer! It seems to work with the datetime64 conversion. I did change .

np.array(alist, 'datetime64[ms]')

to

np.array(alist, 'datetime64[ns]')

By that I get the datetime64 as an integer. Nevertheless, how comes that I get negative values for the early datesas seen in the output?

-----------------------------------------------------
--> Before conversion: ['2020-11-09 20:30:59' '2020-11-08 09:22:54' '2020-11-04 02:24:17' ...
 '1900-01-27 02:30:00' '1900-01-24 03:00:00' '1900-01-18 15:30:00']
-----------------------------------------------------
--> After conversion to dt64: [1604953859000000000 1604827374000000000 1604456657000000000 ...      
 -2206733400000000000 -2206990800000000000 -2207464200000000000]
-----------------------------------------------------

Upvotes: 1

Views: 1548

Answers (2)

FObersteiner
FObersteiner

Reputation: 25544

TLDR:

use astype for conversion to appropriate types:

import numpy as np

# array of dtype object (string)
arr = np.array(['2020-11-09 20:30:59', '2020-11-08 09:22:54', '2020-11-04 02:24:17',
                '1900-01-27 02:30:00', '1900-01-24 03:00:00', '1900-01-18 15:30:00'])

# to numpy's datetime, uses 'datetime64[s]' automatically
dtarr = arr.astype(np.datetime64)
# dtarr.dtype
# dtype('<M8[s]')

# to Unix time, using 1970-01-01 as epoch - values before the epoch are negative
dtarr_unix = dtarr.astype(np.int64) # datetime64[s] will give seconds since the epoch...
# array([ 1604953859,  1604827374,  1604456657, -2206733400, -2206990800,
#         -2207464200], dtype=int64)

# or using 1900-01-01 as epoch:
dtarr_1900 = (dtarr - np.datetime64('1900-01-01')).astype(np.int64)
# array([3813942659, 3813816174, 3813445457,    2255400,    1998000,
#           1524600], dtype=int64)

related: How to get unix timestamp from numpy.datetime64

Upvotes: 1

hpaulj
hpaulj

Reputation: 231325

I was going to say that datatime64 can be picky when converting strings. It can't handle every kind of delimiter. But

In [219]: alist = ['2020-11-09 20:30:59', '2020-11-08 09:22:54', '2020-11-04 02:24:17',
     ...:  '1900-01-27 02:30:00', '1900-01-24 03:00:00', '1900-01-18 15:30:00']
In [220]: alist
Out[220]: 
['2020-11-09 20:30:59',
 '2020-11-08 09:22:54',
 '2020-11-04 02:24:17',
 '1900-01-27 02:30:00',
 '1900-01-24 03:00:00',
 '1900-01-18 15:30:00']

it handles this format just fine:

In [221]: np.array(alist, 'datetime64[ms]')
Out[221]: 
array(['2020-11-09T20:30:59.000', '2020-11-08T09:22:54.000',
       '2020-11-04T02:24:17.000', '1900-01-27T02:30:00.000',
       '1900-01-24T03:00:00.000', '1900-01-18T15:30:00.000'],
      dtype='datetime64[ms]')

This display has a 'T' between date and time, but its lack is not a problem.

converted back to a list, it produces datatime objects:

In [222]: _.tolist()
Out[222]: 
[datetime.datetime(2020, 11, 9, 20, 30, 59),
 datetime.datetime(2020, 11, 8, 9, 22, 54),
 datetime.datetime(2020, 11, 4, 2, 24, 17),
 datetime.datetime(1900, 1, 27, 2, 30),
 datetime.datetime(1900, 1, 24, 3, 0),
 datetime.datetime(1900, 1, 18, 15, 30)]

Or just displaying seconds

In [223]: np.array(alist, 'datetime64[s]')
Out[223]: 
array(['2020-11-09T20:30:59', '2020-11-08T09:22:54',
       '2020-11-04T02:24:17', '1900-01-27T02:30:00',
       '1900-01-24T03:00:00', '1900-01-18T15:30:00'],
      dtype='datetime64[s]')

Parsing a single datatime string

In [235]: alist[0]
Out[235]: '2020-11-09 20:30:59'
In [236]: np.datetime64(alist[0])
Out[236]: numpy.datetime64('2020-11-09T20:30:59')

If it's an object dtype array (such as one might get from a dataframe):

In [254]: arr = np.array(alist, object)
In [255]: arr
Out[255]: 
array(['2020-11-09 20:30:59', '2020-11-08 09:22:54',
       '2020-11-04 02:24:17', '1900-01-27 02:30:00',
       '1900-01-24 03:00:00', '1900-01-18 15:30:00'], dtype=object)
In [256]: arr1 = arr.astype('datetime64[s]')
In [257]: arr1
Out[257]: 
array(['2020-11-09T20:30:59', '2020-11-08T09:22:54',
       '2020-11-04T02:24:17', '1900-01-27T02:30:00',
       '1900-01-24T03:00:00', '1900-01-18T15:30:00'],
      dtype='datetime64[s]')

Upvotes: 1

Related Questions