Reputation: 5117
Suppose I am running the following lines in Python with pandas:
import pandas as pd

# Load data
data = pd.read_csv('C:/Users/user/Desktop/data.txt',
                   keep_default_na=True, sep='\t', na_values='?')
# Convert to datetime column
data['Date'] = pd.to_datetime(data['Date'], errors='raise', dayfirst=True)
However, I want to see every value in this column that makes pandas raise an exception.
For this reason I wrote this:
exceptions = []
for index, row in data.iterrows():
    try:
        pd.to_datetime(row['Date'], errors='raise', dayfirst=True)
    except Exception:
        # Collect every value that pandas fails to parse
        exceptions.append(row['Date'])
dataframe = pd.DataFrame({'Exceptions': exceptions})
dataframe.to_csv('C:/Users/user/Desktop/EXCEPTIONS.csv', index=False, na_rep='NA')
Is there a better way to do this? I thought there would be a built-in pandas way to do it.
Upvotes: 0
Views: 321
Reputation: 59549
Use .loc to select all of the problematic rows by checking .isnull() on the result of pd.to_datetime with errors='coerce'. I exclude NaN because pd.to_datetime will not raise an error for null values.
import pandas as pd
import numpy as np
data = pd.DataFrame({'Date': [np.nan, '12-03-2019', '001111231', '46-06-1988']})
# Date
#0 NaN
#1 12-03-2019
#2 001111231
#3 46-06-1988
data.loc[pd.to_datetime(data.Date, errors='coerce', dayfirst=True).isnull()
         & data.Date.notnull(), 'Date']
#2 001111231
#3 46-06-1988
#Name: Date, dtype: object
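If you need this for more than one column, the same mask can be wrapped in a small helper. This is only a sketch: `unparseable_dates` is a hypothetical name, not a pandas function, and it assumes a column of strings and nulls like the sample data above.

```python
import numpy as np
import pandas as pd


def unparseable_dates(series, **to_datetime_kwargs):
    """Return the non-null values of `series` that pd.to_datetime cannot parse.

    Hypothetical helper, not part of pandas. Extra keyword arguments
    (e.g. dayfirst=True) are passed straight through to pd.to_datetime.
    """
    parsed = pd.to_datetime(series, errors='coerce', **to_datetime_kwargs)
    # A value that was not null before coercion but is NaT afterwards
    # is one that would have raised with errors='raise'.
    return series[parsed.isna() & series.notna()]


data = pd.DataFrame({'Date': [np.nan, '12-03-2019', '001111231', '46-06-1988']})
bad = unparseable_dates(data['Date'], dayfirst=True)

# Same layout as the CSV in the question; with no path given,
# to_csv returns the text instead of writing a file.
csv_text = bad.to_frame(name='Exceptions').to_csv(index=False)
print(csv_text)
```

Passing a file path as the first argument of `to_csv` would write the result to disk, as in the question. The original index is kept on `bad`, so you can still see which rows the bad values came from.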
Upvotes: 1