Reputation: 11
I am working with big databases that contain dates. Currently, the dates are strings in the format dd/mm/YYYY,
so I use pd.to_datetime()
. It works for almost every table, but it fails on a few because some of the dates are wrong. For example, instead of '1999' the year is '0199'. Because of this, the output was an "out of bounds nanosecond timestamp" error.
Since the errors don't follow a pattern and I don't want to exclude the rows manually, what should I do to convert the rows that are correct and ignore the ones that raise errors?
Upvotes: 1
Views: 792
Reputation: 731
You can have pandas ignore the values it can't parse and set them to NaT
instead. reference
example:
sanitized_dates = pd.to_datetime(dates, errors='coerce')
If you need more specific handling, you can write your own function and use Series.apply()
to make any targeted corrections.
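For instance, a minimal sketch of that apply-based approach (the parse_date helper name and the sample data are illustrative assumptions; the try/except is where you would insert table-specific fixes):

```python
import pandas as pd

def parse_date(value):
    # Attempt strict dd/mm/YYYY parsing; return NaT for anything that
    # fails. As written this behaves like errors='coerce', but the
    # except branch is the place to add your own correction logic.
    try:
        return pd.to_datetime(value, format='%d/%m/%Y')
    except (ValueError, pd.errors.OutOfBoundsDatetime):
        return pd.NaT

dates = pd.Series(['25/12/1999', '01/01/0199', 'not a date'])
sanitized = dates.apply(parse_date)
# '01/01/0199' and 'not a date' both become NaT; the valid row parses.
```

Note that pandas.errors.OutOfBoundsDatetime is raised for years outside the nanosecond-timestamp range (roughly 1677 to 2262), which is exactly the error the question hit with '0199'.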
Upvotes: 1