Reputation: 1967
I have a DataFrame with a column of dates stored as strings (dtype: object). However, some entries are missing (NaN) or contain a string that is not a date. Here is an example:
import numpy as np
import pandas as pd
data = [{"date_string": "01:01:2019 00:00:00"}, {"date_string": " "}, {"date_string": np.nan}]
df = pd.DataFrame(data)
Converting the string to DateTime throws an error:
df.date_string = df.date_string.apply(pd.to_datetime)
ValueError: String does not contain a date.
Is there a way to bypass entries missing a date?
Upvotes: 0
Views: 436
Reputation: 75090
The docs for pd.to_datetime say:
errors : {'ignore', 'raise', 'coerce'}, default 'raise'
    If 'raise', then invalid parsing will raise an exception.
    If 'coerce', then invalid parsing will be set as NaT.
    If 'ignore', then invalid parsing will return the input.
So you should pass errors='coerce' to bypass the invalid datetime values:
df.date_string.apply(pd.to_datetime, errors='coerce')
Or without apply for a single column:
pd.to_datetime(df.date_string, errors='coerce')
0 2019-07-22
1 NaT
2 NaT
Name: date_string, dtype: datetime64[ns]
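Note that the first row above shows 2019-07-22 rather than 1 January 2019, which suggests the colon-separated date was not read as intended. A minimal sketch of one way around that, assuming the strings are day:month:year followed by a time (the question does not say which comes first, so the format string here is a guess), is to pass an explicit format together with errors='coerce'. The column name `date` and the variable `valid` are made up for illustration:

```python
import numpy as np
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({"date_string": ["01:01:2019 00:00:00", " ", np.nan]})

# Assumption: strings are day:month:year followed by hour:minute:second.
# Adjust the format string if the real data uses a different order.
parsed = pd.to_datetime(
    df["date_string"],
    format="%d:%m:%Y %H:%M:%S",
    errors="coerce",  # unparseable or missing entries become NaT
)

# Optionally keep only the rows that parsed successfully.
valid = df.assign(date=parsed).dropna(subset=["date"])
```

With an explicit format, anything that does not match it (including the blank string and NaN) ends up as NaT, so dropna can be used afterwards if the rows without a date should be skipped entirely.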
Upvotes: 2