Reputation: 43
I have written this date parsing function:
def date_parser(string):
    try:
        date = pd.datetime.strptime(string, "%d/%m/%Y")
    except:
        date = pd.NaT
    return date
and I call it in pd.read_csv like this:
df = pd.read_csv(os.path.join(path, file),
                 sep=";",
                 encoding="latin-1",
                 keep_default_na=False,
                 na_values=na_values,
                 index_col=False,
                 usecols=keep,
                 dtype=dtype,
                 date_parser=date_parser,
                 parse_dates=dates)
The problem is that in one of my date columns, I end up with mixed data types:
df[data].apply(type).value_counts()
I should only have the last two, right?
Upvotes: 1
Views: 722
Reputation: 862451
I suggest changing your function to use to_datetime with errors='coerce', which returns NaT for any value that does not match the format %d/%m/%Y:
def date_parser(string):
    return pd.to_datetime(string, format="%d/%m/%Y", errors='coerce')
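To see what errors='coerce' does, here is a minimal sketch that applies the same to_datetime call to a whole column after loading, instead of going through date_parser. The sample data and the column names id and date are made up for illustration, and reading the column as plain strings first is an assumption about how you might want to structure it, not something your original call does.

```python
import io

import pandas as pd

# Hypothetical sample data: one valid date, one junk value, one empty
# field and one impossible date (31 February).
csv_text = "id;date\n1;25/12/2020\n2;not a date\n3;\n4;31/02/2021\n"

# Read everything as strings so no date parsing happens inside read_csv.
df = pd.read_csv(io.StringIO(csv_text), sep=";", keep_default_na=False, dtype=str)

# Convert the whole column at once; values that do not match
# %d/%m/%Y are turned into NaT instead of raising.
df["date"] = pd.to_datetime(df["date"], format="%d/%m/%Y", errors="coerce")

print(df["date"].isna().tolist())  # [False, True, True, True]

# The same check as in the question: count the Python types in the column.
print(df["date"].apply(type).value_counts())
```

Because the conversion runs on the entire column, the result has a single datetime dtype, so the type count should only show timestamps and NaT.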
Upvotes: 3