Convert column with strings and missings to DateTime

Question

I have a DataFrame column of dates stored as strings (dtype: object). Some entries are missing (NaN), and others contain a string that is not a date. Here is an example:

import numpy as np
import pandas as pd
data = [{"date_string": "01:01:2019 00:00:00"}, {"date_string": " "}, {"date_string": np.nan}]
df = pd.DataFrame(data)

Converting the string to DateTime throws an error:

df.date_string = df.date_string.apply(pd.to_datetime)


ValueError: String does not contain a date.

Is there a way to bypass entries missing a date?

anky · Accepted Answer

The docs for pd.to_datetime say:

errors : {‘ignore’, ‘raise’, ‘coerce’}, default ‘raise’
If ‘raise’, then invalid parsing will raise an exception
If ‘coerce’, then invalid parsing will be set as NaT
If ‘ignore’, then invalid parsing will return the input
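
To illustrate the difference between 'raise' and 'coerce', here is a minimal sketch on a single value. The string "not a date" is a made-up input, not taken from the question, and the exact exception message may vary between pandas versions.

```python
import pandas as pd

bad_value = "not a date"  # hypothetical input, chosen to be clearly unparseable

# errors='raise' (the default): invalid input raises an exception
try:
    pd.to_datetime(bad_value)
    raised = False
except ValueError:
    raised = True

# errors='coerce': invalid input is turned into NaT instead
coerced = pd.to_datetime(bad_value, errors="coerce")

print(raised)            # whether the default mode raised
print(pd.isna(coerced))  # whether the coerced result is NaT
```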

So you should pass errors='coerce' to bypass the invalid datetime values:

df.date_string.apply(pd.to_datetime, errors='coerce')

Or without apply for a single column:

pd.to_datetime(df.date_string, errors='coerce')

0   2019-07-22
1          NaT
2          NaT
Name: date_string, dtype: datetime64[ns]
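
Since the sample string separates the date parts with colons, it may also help to pass an explicit format so the parser does not have to guess. The sketch below assumes the layout is day:month:year followed by the time; the question does not say which of the first two fields is the day, so adjust the format if it is the other way round. The variable names parsed and unparseable are only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"date_string": ["01:01:2019 00:00:00", " ", np.nan]})

# Assumed layout: day:month:year hour:minute:second
parsed = pd.to_datetime(
    df["date_string"], format="%d:%m:%Y %H:%M:%S", errors="coerce"
)

# Rows that held some value but could not be parsed,
# as opposed to rows that were already missing (NaN)
unparseable = parsed.isna() & df["date_string"].notna()

print(parsed)
print(unparseable)
```

Comparing the NaT positions against the original missing values, as done with unparseable here, is one way to check that 'coerce' is not silently discarding strings you expected to parse.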
