Reputation: 20352
How can I find non-dates in a specific field in a dataframe? I have dates that look like this:
20260521
20201207
20230130
When I look at the datatypes this is listed as 'object'. I made a few attempts to coerce these items into actual dates, using the following code.
This, actually does nothing.
df['MATURITY'] = pd.to_datetime(df['MATURITY'], errors='ignore')
The following two attempts throw an error.
df['MATURITY'] = df['MATURITY'].astype('datetime64[ns]')
df['MATURITY'] = pd.to_datetime(df.MATURITY)
Error Message:
ValueError: ('String does not contain a date:', ' ')
I can probably just delete these records, but I would like to see what they look like, before arbitrarily deleting them. So, I have two questions:
1) How can I view these specific records, in the field named 'MATURITY' which are throwing errors?
2) Also, if I want to delete these specific records, how can I do it?
Upvotes: 0
Views: 1069
Reputation: 59284
As per docs
, the argument errors
in pd.to_datetime
can be:
errors : {‘ignore’, ‘raise’, ‘coerce’}, default ‘raise’
If ‘raise’, then invalid parsing will raise an exception
If ‘coerce’, then invalid parsing will be set as NaT
If ‘ignore’, then invalid parsing will return the input
Hence, you should use errors='coerce'
and work with the NaN
s
df['MATURITY'] = pd.to_datetime(df['MATURITY'], errors='coerce')
1) How can I view these specific records, in the field named 'MATURITY' which are throwing errors?
df.loc[df['MATURITY'].isnull()]
2) Also, if I want to delete these specific records, how can I do it?
df.dropna(subset='MATURITY')
Upvotes: 4