Rachel

Reputation: 1967

Convert column with strings and missings to DateTime

I have a dataframe with a column of dates stored as strings (dtype: object). Some entries are missing (NaN), and others contain a string that is not a date. Here is an example:

import numpy as np
import pandas as pd

data = [{"date_string": "01:01:2019 00:00:00"}, {"date_string": " "}, {"date_string": np.nan}]
df = pd.DataFrame(data)

Converting the string to DateTime throws an error:

df.date_string = df.date_string.apply(pd.to_datetime)


ValueError: String does not contain a date.

Is there a way to bypass entries missing a date?

Upvotes: 0

Views: 436

Answers (1)

anky

Reputation: 75090

The docs for pd.to_datetime say:

errors : {‘ignore’, ‘raise’, ‘coerce’}, default ‘raise’
If ‘raise’, then invalid parsing will raise an exception
If ‘coerce’, then invalid parsing will be set as NaT
If ‘ignore’, then invalid parsing will return the input

So you should pass errors='coerce' to bypass the invalid datetime values:

df.date_string.apply(pd.to_datetime, errors='coerce')

Or without apply for a single column:

pd.to_datetime(df.date_string, errors='coerce')

0   2019-07-22
1          NaT
2          NaT
Name: date_string, dtype: datetime64[ns]
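
Note that the first row in the output above is 2019-07-22 rather than 1 January 2019, which suggests the colon-separated date was not read as intended when the format is left to inference. As a minimal sketch, assuming the strings follow a day:month:year hour:minute:second layout (an assumption, since 01:01 is ambiguous), passing an explicit format together with errors='coerce' may be more predictable. The column name date and the variables parsed and valid are hypothetical names used only for illustration:

```python
import numpy as np
import pandas as pd

data = [
    {"date_string": "01:01:2019 00:00:00"},
    {"date_string": " "},
    {"date_string": np.nan},
]
df = pd.DataFrame(data)

# Assumed layout: day:month:year hour:minute:second.
# Values that do not match the format (or are missing) become NaT.
parsed = pd.to_datetime(
    df["date_string"], format="%d:%m:%Y %H:%M:%S", errors="coerce"
)

# Keep the parsed values in a new column and drop the rows without a date.
df["date"] = parsed
valid = df.dropna(subset=["date"])
print(valid)
```

Calling pd.to_datetime on the whole column is vectorized, so apply is not needed here.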

Upvotes: 2
