Reputation: 63
I have read in two worksheets separately into pandas dataframes. Both have date issued columns.
In both worksheets the column is held in excel as dd/mm/yyyy.
ws1
13/02/2017
01/02/2017
08/11/2016
05/08/2016
16/03/2017
53 2017-02-13
51 2017-02-01
22 2016-11-08
0 2016-08-05
63 2017-03-16
Name: Date Issued, dtype: datetime64[ns]
But ws2
08/03/2017
24/08/2016
28/11/2016
26/10/2016
10/03/2017
0 2017-03-08 00:00:00
1 2016-08-24 00:00:00
2 2016-11-28 00:00:00
3 2016-10-26 00:00:00
4 2017-03-10 00:00:00
Name: Date Issued, dtype: object
Why is it that the dtypes are different and how can I apply remove the time?
The code currently looks like nothing out the ordinary for dealing with pandas
df = pd.read_excel(file, 'ws2')
df = df.loc[:, ['Date Issued', 'Person ID',
'First Name', 'Surname', 'Type', 'Amount']]
df = df.sort_values(by=['Surname'])
df['Date Issued'] = pd.to_datetime(df_loan['Date Issued'], dayfirst=True)
I have tried to use;
df['Date Issued'] = pd.to_datetime(df['Date Issued'], dayfirst=True)
But get the following error;
TypeError: invalid string coercion to datetime
Also;
df['Date Issued'] = df['Date Issued'].astype('datetime64[ns]')
But getting this error;
ValueError: Error parsing datetime string " " at position 1
Upvotes: 1
Views: 500
Reputation: 862801
It seems there is at least one non datetime value.
So need parameter errors='coerce'
for convert these values to NaT
(NaN
for datetimes) in to_datetime
:
df['Date Issued'] = pd.to_datetime(df['Date Issued'], dayfirst=True, errors='coerce')
Upvotes: 1