Pandas: Integer date not recognized with pd.to_datetime(x)

Question

In the given dataframe,

import pandas as pd
op_d = {'ID': [1, 2, 3], 'V': ['F', 'G', 'H'], 'AAA': [0, 1, 1],
        'E': [20141223, 20190201, 20170203],
        'D': ['2019/02/04', '2019/02/01', '2019/01/01'],
        'DD': ['2019-12-01', '2016-05-31', '2015-02-15'],
        'CurrentRate': [7.5, 2, 2], 'NoteRate': [2, 3, 3], 'BBB': [0, 0, 4],
        'Q1': [2, 8, 0], 'Q2': [3, 5, 7], 'Q3': [5, 6, 8]}
df = pd.DataFrame(data=op_d)
df

If I run pd.to_datetime(df['E']), I get the following:

0   1970-01-01 00:00:00.020141223
1   1970-01-01 00:00:00.020190201
2   1970-01-01 00:00:00.020170203
Name: E, dtype: datetime64[ns]

Is this expected behavior? If so, how can I detect dates stored in an integer column? I know that if the dtype is object, I can wrap the conversion in a try/except block and convert those columns to datetime.

jezrael · Accepted Answer

You need to specify the format parameter; %Y%m%d means YYYYMMDD:

print (pd.to_datetime(df['E'], format='%Y%m%d'))
0   2014-12-23
1   2019-02-01
2   2017-02-03
Name: E, dtype: datetime64[ns]
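
The output in the question is expected: without a format, pd.to_datetime treats integers as offsets from the Unix epoch (nanoseconds by default), which is why every value lands just after 1970-01-01. For the detection part of the question, one possible approach is sketched below. It assumes the integer dates are always eight-digit YYYYMMDD values; looks_like_yyyymmdd is a hypothetical helper name, not part of pandas, and the range check is only a heuristic that can misfire on other eight-digit integers that happen to form valid dates.

```python
import pandas as pd

def looks_like_yyyymmdd(s: pd.Series) -> bool:
    """Hypothetical helper: True if an integer column parses entirely as YYYYMMDD."""
    if not pd.api.types.is_integer_dtype(s):
        return False
    # Heuristic guard (assumption): only consider eight-digit values
    if not s.between(10_000_101, 99_991_231).all():
        return False
    # Invalid dates such as 20191345 become NaT instead of raising
    parsed = pd.to_datetime(s.astype(str), format='%Y%m%d', errors='coerce')
    return bool(parsed.notna().all())

df = pd.DataFrame({'ID': [1, 2, 3],
                   'E': [20141223, 20190201, 20170203],
                   'Q2': [3, 5, 7]})

# Without a format, the integers are read as nanoseconds since 1970-01-01
default_parse = pd.to_datetime(df['E'])

# Find the integer columns that look like dates, then convert them
date_cols = [c for c in df.columns if looks_like_yyyymmdd(df[c])]
for c in date_cols:
    df[c] = pd.to_datetime(df[c].astype(str), format='%Y%m%d')
```

Here only 'E' should pass the check, because 'ID' and 'Q2' fail the eight-digit guard before any parsing is attempted.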
