ASH
ASH

Reputation: 20352

Find non-dates in a dataframe

How can I find non-dates in a specific field in a dataframe? I have dates that look like this:

20260521
20201207
20230130

When I look at the datatypes this is listed as 'object'. I made a few attempts to coerce these items into actual dates, using the following code.

This, actually does nothing.

df['MATURITY'] = pd.to_datetime(df['MATURITY'], errors='ignore')

The following two attempts throw an error.

df['MATURITY'] = df['MATURITY'].astype('datetime64[ns]')
df['MATURITY'] = pd.to_datetime(df.MATURITY)

Error Message:

ValueError: ('String does not contain a date:', ' ')

I can probably just delete these records, but I would like to see what they look like, before arbitrarily deleting them. So, I have two questions:

1) How can I view these specific records, in the field named 'MATURITY' which are throwing errors?

2) Also, if I want to delete these specific records, how can I do it?

Upvotes: 0

Views: 1069

Answers (1)

rafaelc
rafaelc

Reputation: 59284

As per docs, the argument errors in pd.to_datetime can be:

errors : {‘ignore’, ‘raise’, ‘coerce’}, default ‘raise’

If ‘raise’, then invalid parsing will raise an exception

If ‘coerce’, then invalid parsing will be set as NaT

If ‘ignore’, then invalid parsing will return the input

Hence, you should use errors='coerce' and work with the NaNs

df['MATURITY'] = pd.to_datetime(df['MATURITY'], errors='coerce')

1) How can I view these specific records, in the field named 'MATURITY' which are throwing errors?

df.loc[df['MATURITY'].isnull()]

2) Also, if I want to delete these specific records, how can I do it?

df.dropna(subset='MATURITY')

Upvotes: 4

Related Questions