epic_fil

Reputation: 679

Python strptime or alternative for complex date string parsing

I have been given a large list of date-time representations that need to be read into a database. I am using Python (because it rocks). The strings are in a terrible, terrible format where they are not precise to seconds, no timezone is stated, and the hours do not have a leading 0. So they look more like this:

April 29, 2013, 7:52 p.m.
April 30, 2013, 4 p.m.

You'll notice that if something happens between 4:00 and 4:01 it drops the minutes, too (ugh). Anyway, I am trying to parse these with time.strptime, but the docs state that hours must be zero-padded decimal numbers, [01,12] (or [00,23] for the 24-hour clock). Since nothing is padded with 0s, I'm wondering if there is something else I can pass to strptime to accept hours without a leading 0, or if I should try splitting and then padding the strings, or use some other method of constructing the datetime object.

Also, it does not look like strptime accepts AM/PM written as "a.m." or "p.m.", so I'll have to correct that as well.

Note, I am not able to just handle these strings in a batch. I receive them one-at-a-time from a foreign application which sometimes uses nicely formatted Unix epoch timestamps, but occasionally uses this format. Processing them on the fly is the only option.

I am using Python 2.7 with some Python 3 features imported.

from __future__ import (print_function, unicode_literals)

Upvotes: 1

Views: 3954

Answers (1)

Martijn Pieters

Reputation: 1121406

The most flexible parser is part of the third-party dateutil package (installed as python-dateutil); it eats your input for breakfast:

>>> from dateutil import parser
>>> parser.parse('April 29, 2013, 7:52 p.m.')
datetime.datetime(2013, 4, 29, 19, 52)
>>> parser.parse('April 30, 2013, 4 p.m.')
datetime.datetime(2013, 4, 30, 16, 0)
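
If adding a dependency is not an option, a standard-library-only fallback is possible. The following is a minimal sketch, not a definitive implementation. It rests on several assumptions: parse_timestamp is a hypothetical helper name, epoch values are assumed to arrive as plain digit strings counted in seconds (UTC), the process is assumed to run in an English or C locale (%B and %p are locale-dependent), and only the two layouts shown in the question are handled. In CPython, strptime accepts hours without a leading zero for %I, so only the "a.m."/"p.m." spelling needs normalizing.

```python
import re
from datetime import datetime, timedelta

# Layouts tried in order: with minutes first, then the hour-only variant.
_FORMATS = ("%B %d, %Y, %I:%M %p", "%B %d, %Y, %I %p")
_EPOCH = datetime(1970, 1, 1)


def parse_timestamp(text):
    """Hypothetical helper: return a naive datetime for one incoming string."""
    text = text.strip()

    # Assumption: epoch timestamps are digits with an optional fraction.
    if re.match(r"^\d+(\.\d+)?$", text):
        return _EPOCH + timedelta(seconds=float(text))

    # Rewrite "a.m." / "p.m." (any case) as "AM" / "PM" so %p can match.
    normalized = re.sub(
        r"\b([ap])\.m\.",
        lambda m: m.group(1).upper() + "M",
        text,
        flags=re.IGNORECASE,
    )

    for fmt in _FORMATS:
        try:
            return datetime.strptime(normalized, fmt)
        except ValueError:
            continue  # try the next layout
    raise ValueError("unrecognized date string: %r" % (text,))
```

Used on the two sample strings from the question:

```python
print(parse_timestamp('April 29, 2013, 7:52 p.m.'))  # 2013-04-29 19:52:00
print(parse_timestamp('April 30, 2013, 4 p.m.'))     # 2013-04-30 16:00:00
```

Because each string is handled on its own, this fits the one-at-a-time constraint. The returned datetimes are naive; attaching a timezone is left to the caller, since the source strings do not state one.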

Upvotes: 16
