Reputation: 423
I am working on a dataset for machine learning, but parsing the date column raises an error because the dates do not match the format. I tried two different format strings, "%d-%m-%y" and "%d/%m/%y", but neither worked. What can I do, given that the dates in the dataset are in different formats?
df_MR['Date'] = pd.to_datetime(df_MR['Date'], format="%d-%m-%y")
ValueError: time data '30/01/20' does not match format '%d-%m-%y' (match)
df_MR['Date'] = pd.to_datetime(df_MR['Date'], format="%d/%m/%y")
ValueError: time data '02-01-2020' does not match format '%d/%m/%y' (match)
Upvotes: 1
Views: 5820
Reputation: 42247
What can I do, given that the dates in the dataset are in different formats?
try:  # try to parse 4-digit years
    df_MR['Date'] = pd.to_datetime(df_MR['Date'], format="%d-%m-%Y")
except ValueError:  # fall back to 2-digit years
    df_MR['Date'] = pd.to_datetime(df_MR['Date'], format="%d/%m/%y")
Another alternative is to not pass a format at all and hope that pandas gets it right. Since both of your date formats are in DMY order, you could try pd.to_datetime(dt, dayfirst=True).
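If a single column really mixes both separators and both year lengths, as the two error messages in the question suggest, one format string applied to the whole column will fail whichever one is chosen. Here is a minimal sketch of one way around that: parse the column once per candidate format with errors="coerce" (rows that do not match become NaT) and keep the first successful parse for each row. The sample data and the candidate_formats list are assumptions for illustration, not taken from the real dataset.

```python
import pandas as pd

# Hypothetical sample mixing the formats seen in the question's error messages.
df_MR = pd.DataFrame({"Date": ["30/01/20", "02-01-2020", "15/03/2020", "07-04-20"]})

# Candidate formats, all day-first. This list is an assumption about the data;
# 4-digit-year formats are listed before their 2-digit counterparts.
candidate_formats = ["%d-%m-%Y", "%d/%m/%Y", "%d-%m-%y", "%d/%m/%y"]

raw = df_MR["Date"]

# Parse with the first format; rows that do not match become NaT.
parsed = pd.to_datetime(raw, format=candidate_formats[0], errors="coerce")

for fmt in candidate_formats[1:]:
    # Fill only the rows that are still NaT with the result of the next format.
    parsed = parsed.fillna(pd.to_datetime(raw, format=fmt, errors="coerce"))

df_MR["Date"] = parsed
print(df_MR)
```

Any row that matches none of the candidate formats stays NaT, so parsed.isna() can be used afterwards to spot values that need another format.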
Upvotes: 0
Reputation: 8942
I've had some success using the infer_datetime_format argument of to_datetime in a small example:
>>> df = pd.DataFrame({'a': ['02-01-2020', '03-02-20', '03/02/2020', '04/05/2020']})
>>> pd.to_datetime(df['a'], infer_datetime_format=True)
0   2020-02-01
1   2020-03-02
2   2020-03-02
3   2020-04-05
Name: a, dtype: datetime64[ns]
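Note that the output above reads '02-01-2020' as 1 February, i.e. month first. If the data is day-first, as in the question, a rough sketch of an alternative is to parse each string on its own with dayfirst=True, so that one row's format does not influence how another row is read. This is slower than a vectorised call and depending on the pandas version may emit a warning about falling back to element-wise parsing. The "parsed" column name is just for illustration.

```python
import pandas as pd

df = pd.DataFrame({"a": ["02-01-2020", "03-02-20", "03/02/2020", "04/05/2020"]})

# Parse every string individually; dayfirst=True asks the parser to read
# ambiguous dates such as 03/02/2020 as day/month/year.
df["parsed"] = df["a"].apply(lambda s: pd.to_datetime(s, dayfirst=True))

print(df)
```

With day-first parsing, '02-01-2020' should come out as 2 January 2020 rather than 1 February.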
Upvotes: 5