Reputation: 47
I am quite new to python and trying some features. I searched a lot in here for a possible solution with no success. This is my issue:
Goal:
read a column with date and time from an .xlsx file --> works
get this column in a numpy array --> works
The array item is read from the .xlsx as an Object
The array output looks like this:
['2020-11-09 20:30:59' '2020-11-08 09:22:54' '2020-11-04 02:24:17' ...
'1900-01-27 02:30:00' '1900-01-24 03:00:00' '1900-01-18 15:30:00']
I tried the following:
x[:, 0] = [np.datetime64(x[:, 0])]
--> this doesn't work; so I tried to convert to String before.
x[:, 0] = [np.datetime64(str(x[:, 0]))]
--> This does also not work with this error message:
ValueError: Error parsing datetime string "['2020-11-09 20:30:59' '2020-11-08 09:22:54' '2020-11-04 02:24:17' ...
'1900-01-27 02:30:00' '1900-01-24 03:00:00' '1900-01-18 15:30:00']" at position 0
Thanks a lot for the answer! It seems to work with the datetime64 conversion. I did change .
np.array(alist, 'datetime64[ms]')
to
np.array(alist, 'datetime64[ns]')
By that I get the datetime64 as an integer. Nevertheless, how comes that I get negative values for the early datesas seen in the output?
-----------------------------------------------------
--> Before conversion: ['2020-11-09 20:30:59' '2020-11-08 09:22:54' '2020-11-04 02:24:17' ...
'1900-01-27 02:30:00' '1900-01-24 03:00:00' '1900-01-18 15:30:00']
-----------------------------------------------------
--> After conversion to dt64: [1604953859000000000 1604827374000000000 1604456657000000000 ...
-2206733400000000000 -2206990800000000000 -2207464200000000000]
-----------------------------------------------------
Upvotes: 1
Views: 1548
Reputation: 25544
TLDR:
use astype
for conversion to appropriate types:
import numpy as np
# array of dtype object (string)
arr = np.array(['2020-11-09 20:30:59', '2020-11-08 09:22:54', '2020-11-04 02:24:17',
'1900-01-27 02:30:00', '1900-01-24 03:00:00', '1900-01-18 15:30:00'])
# to numpy's datetime, uses 'datetime64[s]' automatically
dtarr = arr.astype(np.datetime64)
# dtarr.dtype
# dtype('<M8[s]')
# to Unix time, using 1970-01-01 as epoch - values before the epoch are negative
dtarr_unix = dtarr.astype(np.int64) # datetime64[s] will give seconds since the epoch...
# array([ 1604953859, 1604827374, 1604456657, -2206733400, -2206990800,
# -2207464200], dtype=int64)
# or using 1900-01-01 as epoch:
dtarr_1900 = (dtarr - np.datetime64('1900-01-01')).astype(np.int64)
# array([3813942659, 3813816174, 3813445457, 2255400, 1998000,
# 1524600], dtype=int64)
related: How to get unix timestamp from numpy.datetime64
Upvotes: 1
Reputation: 231325
I was going to say that datatime64
can be picky when converting strings. It can't handle every kind of delimiter. But
In [219]: alist = ['2020-11-09 20:30:59', '2020-11-08 09:22:54', '2020-11-04 02:24:17',
...: '1900-01-27 02:30:00', '1900-01-24 03:00:00', '1900-01-18 15:30:00']
In [220]: alist
Out[220]:
['2020-11-09 20:30:59',
'2020-11-08 09:22:54',
'2020-11-04 02:24:17',
'1900-01-27 02:30:00',
'1900-01-24 03:00:00',
'1900-01-18 15:30:00']
it handles this format just fine:
In [221]: np.array(alist, 'datetime64[ms]')
Out[221]:
array(['2020-11-09T20:30:59.000', '2020-11-08T09:22:54.000',
'2020-11-04T02:24:17.000', '1900-01-27T02:30:00.000',
'1900-01-24T03:00:00.000', '1900-01-18T15:30:00.000'],
dtype='datetime64[ms]')
This display has a 'T' between date and time, but its lack is not a problem.
converted back to a list, it produces datatime
objects:
In [222]: _.tolist()
Out[222]:
[datetime.datetime(2020, 11, 9, 20, 30, 59),
datetime.datetime(2020, 11, 8, 9, 22, 54),
datetime.datetime(2020, 11, 4, 2, 24, 17),
datetime.datetime(1900, 1, 27, 2, 30),
datetime.datetime(1900, 1, 24, 3, 0),
datetime.datetime(1900, 1, 18, 15, 30)]
Or just displaying seconds
In [223]: np.array(alist, 'datetime64[s]')
Out[223]:
array(['2020-11-09T20:30:59', '2020-11-08T09:22:54',
'2020-11-04T02:24:17', '1900-01-27T02:30:00',
'1900-01-24T03:00:00', '1900-01-18T15:30:00'],
dtype='datetime64[s]')
Parsing a single datatime string
In [235]: alist[0]
Out[235]: '2020-11-09 20:30:59'
In [236]: np.datetime64(alist[0])
Out[236]: numpy.datetime64('2020-11-09T20:30:59')
If it's an object dtype array (such as one might get from a dataframe):
In [254]: arr = np.array(alist, object)
In [255]: arr
Out[255]:
array(['2020-11-09 20:30:59', '2020-11-08 09:22:54',
'2020-11-04 02:24:17', '1900-01-27 02:30:00',
'1900-01-24 03:00:00', '1900-01-18 15:30:00'], dtype=object)
In [256]: arr1 = arr.astype('datetime64[s]')
In [257]: arr1
Out[257]:
array(['2020-11-09T20:30:59', '2020-11-08T09:22:54',
'2020-11-04T02:24:17', '1900-01-27T02:30:00',
'1900-01-24T03:00:00', '1900-01-18T15:30:00'],
dtype='datetime64[s]')
Upvotes: 1