blue note

Reputation: 29071

python dateutil.parser wrong (??) parsing

I am trying the following (Python 3.6):

import dateutil.parser as dp
t1 = '0001-04-23T02:25:43.511Z'
t2 = '0001-04-23T01:25:43.511Z'
print(dp.parse(t1))
print(dp.parse(t2))

which gives me:

0001-04-23 02:25:43.511000+00:00
0023-01-04 01:25:43.511000+00:00

In various similar cases, when the year string has the form 00XY and the hour string is XY (here, year 0001 and hour 01), the parser seems to produce the wrong output. Am I missing something, or is this a bug?

Upvotes: 4

Views: 8530

Answers (1)

Paul

Reputation: 10863

This was a bug in dateutil that has since been fixed (the initial work and the fix for this specific edge case landed as separate changes). Upgrading to python-dateutil>=2.7.0 will fix your issue.

import dateutil
import dateutil.parser as dp

print(dateutil.__version__)
# 2.7.2

t1 = '0001-04-23T02:25:43.511Z'
t2 = '0001-04-23T01:25:43.511Z'

print(dp.parse(t1))
# 0001-04-23 02:25:43.511000+00:00

print(dp.parse(t2))
# 0001-04-23 01:25:43.511000+00:00
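If you cannot control which version is installed, one option is to check the version string before relying on the fix. The following is only a sketch: `version_tuple` and `has_year_fix` are hypothetical helper names, not part of dateutil, and the comparison assumes plain dotted version strings such as `2.7.2` or `2.9.0.post0`.

```python
def version_tuple(version_string):
    """Return the leading numeric components of a dotted version string.

    Parsing stops at the first non-numeric piece, so '2.9.0.post0'
    becomes (2, 9, 0). Short versions are padded with zeros to three parts.
    """
    parts = []
    for piece in version_string.split('.'):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def has_year_fix(version_string):
    """True if the version is at least 2.7.0, the release named above."""
    return version_tuple(version_string) >= (2, 7, 0)


print(has_year_fix('2.6.1'))        # False
print(has_year_fix('2.7.2'))        # True
print(has_year_fix('2.9.0.post0'))  # True
```

In practice you would call it as `has_year_fix(dateutil.__version__)`.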

I do not recommend using yearfirst as a workaround: it has other effects on how your datetime strings are parsed, and it is essentially an implementation detail that it works at all in the buggy case (the bug involves interpreting 0001 as being equivalent to 01, which it is not).
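To illustrate the kind of side effect meant here, this small sketch parses an ambiguous three-number date with and without yearfirst. The input string `01/05/09` is just an illustrative value, not something from the question.

```python
import dateutil.parser as dp

ambiguous = '01/05/09'

# Default: the last number is taken as the year (month/day/year).
default_result = dp.parse(ambiguous)

# yearfirst=True: the first number is taken as the year (year/month/day).
yearfirst_result = dp.parse(ambiguous, yearfirst=True)

print(default_result.date())    # 2009-01-05
print(yearfirst_result.date())  # 2001-05-09
```

The same string yields two different dates, so turning on yearfirst globally can silently change the meaning of other inputs.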

If you do know that you have an ISO-8601 formatted datetime, dateutil.parser.isoparse will be faster and stricter, and does not have this bug. It was also introduced in version 2.7.0:

from dateutil.parser import isoparse

print(isoparse('0001-04-23T02:25:43.511Z'))
# 0001-04-23 02:25:43.511000+00:00

print(isoparse('0001-04-23T01:25:43.511Z'))
# 0001-04-23 01:25:43.511000+00:00
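As a rough illustration of what "stricter" means, the sketch below feeds a non-ISO string (an illustrative value, not from the question) to both parsers. It assumes python-dateutil>=2.7.0 is installed.

```python
import dateutil.parser as dp
from dateutil.parser import isoparse

loose = 'April 23, 2001'

# The general-purpose parser accepts free-form date strings.
print(dp.parse(loose))
# 2001-04-23 00:00:00

# isoparse only accepts ISO-8601 input and raises ValueError otherwise.
try:
    isoparse(loose)
    rejected = False
except ValueError:
    rejected = True

print(rejected)
# True
```

If your input is supposed to be ISO-8601, that ValueError is useful: malformed data fails loudly instead of being guessed at.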

Upvotes: 1
