Doug Fir
Doug Fir

Reputation: 21282

How to handle a timestamp field where the off set is out of the acceptable bounds

I have a field in a data frame with is ISO time with offset:

pages[['dimension1', 'dimension3']].head()
               dimension1                     dimension3
  1572461291083.sanyrqy8   2019-10-30T14:45:42.71-04:00

Most of the rows are fine except some have an off set outside 24 hours.

x = pd.to_datetime(pages.dimension3)

ValueError: offset must be a timedelta strictly between -timedelta(hours=24) and timedelta(hours=24).

Here's an example of a rogue data point that's causing this error:

2019-11-11T07:08:09.640-31:00

My current task is not to solve why the data exists in this way but simply to get the raw data into a Postgres.

Is there some kind of if else logic I can use on this field to tell pandas when using to_datetime() if the offset is larger than 24 then to change it to 24? This would alter the rogue example above to be 2019-11-11T07:08:09.640-24:00

How could I do that with Pandas?

Upvotes: 1

Views: 1320

Answers (1)

oppressionslayer
oppressionslayer

Reputation: 7224

Use dateutil. It's great for parsing dates that give errors

import dateutil                                                                                                                                                                     

dateutil.parser.parse('2019-10-30T14:45:42.71-04:00')                                                                                                                               
# datetime.datetime(2019, 10, 30, 14, 45, 42, 710000, tzinfo=tzoffset(None, -14400))

dateutil.parser.parse('2019-11-11T07:08:09.640-31:00')                                                                                                                              
# datetime.datetime(2019, 11, 11, 7, 8, 9, 640000, tzinfo=tzoffset(None, -111600))

Upvotes: 1

Related Questions