Reputation: 6345
I'm getting a value error saying my data does not match the format when it does. Not sure if this is a bug or I'm missing something here. I'm referring to this documentation for the string format. The weird part is if I write the 'data' Dataframe to a csv and read it in then call the function below it will convert the date so I'm not sure why it doesn't work without writing to a csv.
Any ideas?
data['Date'] = pd.to_datetime(data['Date'], format='%d-%b-%Y')
I'm getting two errors
TypeError: Unrecognized value type: <class 'str'>
ValueError: time data '27‑Aug‑2018' does not match format '%d-%b-%Y' (match)
Example dates -
2‑Jul‑2018
27‑Aug‑2018
28‑May‑2018
19‑Jun‑2017
5‑Mar‑2018
15‑Jan‑2018
11‑Nov‑2013
23‑Nov‑2015
23‑Jun‑2014
18‑Jun‑2018
30‑Apr‑2018
14‑May‑2018
16‑Apr‑2018
26‑Feb‑2018
19‑Mar‑2018
29‑Jun‑2015
Is it because they all aren't double digit days? What is the string format value for single digit days? Looks like this could be the cause but I'm not sure why it would error on the '27' though.
End solution (It was unicode & not a string) -
data['Date'] = data['Date'].apply(unidecode.unidecode)
data['Date'] = data['Date'].apply(lambda x: x.replace("-", "/"))
data['Date'] = pd.to_datetime(data['Date'], format="%d/%b/%Y")
Upvotes: 1
Views: 11433
Reputation: 323226
You got a special mark here it is not -
df.iloc[0,0][2]
Out[287]: '‑'
Replace it with '-'
pd.to_datetime(df.iloc[:,0].str.replace('‑','-'),format='%d-%b-%Y')
Out[288]:
0 2018-08-27
1 2018-05-28
2 2017-06-19
3 2018-03-05
4 2018-01-15
5 2013-11-11
6 2015-11-23
7 2014-06-23
8 2018-06-18
9 2018-04-30
10 2018-05-14
11 2018-04-16
12 2018-02-26
13 2018-03-19
14 2015-06-29
Name: 2‑Jul‑2018, dtype: datetime64[ns]
Upvotes: 1
Reputation: 6091
There seems to be an issue with your date strings. I replicated your issue with your sample data and if I remove the hyphens and replace them manually (for the first three dates) then the code works
pd.to_datetime(df1['Date'] ,errors ='coerce')
output:
0 2018-07-02
1 2018-08-27
2 2018-05-28
3 NaT
4 NaT
5 NaT
6 NaT
7 NaT
8 NaT
9 NaT
10 NaT
11 NaT
12 NaT
13 NaT
14 NaT
15 NaT
Bottom line: your hyphens look like regular ones but are actually something else, just clean your source data and you're good to go
Upvotes: 2