HT121
HT121

Reputation: 451

datetime not converting uppercase month abbrevations

I have 1.6 million rows of data and I need to calculate difference between two dates. I have read the data from an excel file and then looked for the dtypes of columns. The date and time columns both have object datatypes. I need to convert these columns into datetime. I have tried it but it is only working for time but not for date. I have tried the following:

test = pd.read_excel(datadir+'test.xlsx', sheetname = 'Sheet1' )
test['time'] =  pd.to_datetime(test['time'], format='%H:%M:%S')
test['date'] =  pd.to_datetime(test['date'], format='%d%b%y') 

It does work for time but not for date and gives the following error: ValueError: unconverted data remains: 14

I have also tried without specifying format for date but then I get ValueError: Unknown string format

print(len(test['date'])) returns 7 print(test['date'].iloc[0]) returns 17JAN2014.

Here is the sample data: date time 17JAN2014 12:48 18JAN2014 13:15 17JAN2014 09:20 18JAN2014 07:45 04FEB2012 10:00 05FEB2012 17:22 12MAR2012 19:50

I have no clue what is going wrong. Any help will be appreciated. One more thing, Can anyone tells me how do I combine date and time columns into one column as a datetime?

Upvotes: 1

Views: 1304

Answers (1)

jezrael
jezrael

Reputation: 863301

There is problem with year, need Y for match YYYY, y match YY.
Check also http://strftime.org/.

df['date'] =  pd.to_datetime(df['date'], format='%d%b%Y')
print (df)
        date   time
0 2014-01-17  12:48
1 2014-01-18  13:15
2 2014-01-17  09:20
3 2014-01-18  07:45
4 2012-02-04  10:00
5 2012-02-05  17:22
6 2012-03-12  19:50

Upvotes: 1

Related Questions