codyc4321

Reputation: 9682

Normalizing JSON date strings to UTC in Python

I have an important test that says "Calculate users that logged in during the month of April normalized to the UTC timezone."

The items look like this:

[ {u'email': u' [email protected]',
  u'login_date': u'2014-05-08T22:30:57-04:00'},
 {u'email': u'[email protected]',
  u'login_date': u'2014-04-25T13:27:48-08:00'},
]

It seems to me that an item like 2014-04-13T17:12:20-04:00 means "April 13th, 2014, at 5:12:20 pm, 4 hours behind UTC". Can I just use strptime to convert it to a datetime (see "Converting JSON date string to python datetime") and then subtract a timedelta of however many hours I get from a regex that grabs the end of the string? I ask because some items have a + at the end instead of a -, like 2014-05-07T00:30:06+07:00.
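
The idea described above (strptime for the date part, a regex for the trailing offset) could be sketched roughly as follows. The names OFFSET_RE and to_utc are hypothetical, and the sketch assumes every string has exactly the shape shown in the sample data. Because the local time equals UTC plus the signed offset, the signed offset is subtracted to get back to UTC.

```python
import re
from datetime import datetime, timedelta

# Hypothetical helper pattern: 19 characters of date and time, then "+HH:MM" or "-HH:MM".
OFFSET_RE = re.compile(r'^(.{19})([+-])(\d{2}):(\d{2})$')

def to_utc(datestr):
    """Return a naive datetime in UTC for a string like 2014-04-13T17:12:20-04:00."""
    match = OFFSET_RE.match(datestr)
    if match is None:
        raise ValueError('unexpected date format: %r' % datestr)
    local_part, sign, hours, minutes = match.groups()
    local = datetime.strptime(local_part, '%Y-%m-%dT%H:%M:%S')
    offset = timedelta(hours=int(hours), minutes=int(minutes))
    if sign == '-':
        offset = -offset
    # local = UTC + offset, so UTC = local - offset.
    return local - offset

print(to_utc('2014-04-13T17:12:20-04:00'))  # 2014-04-13 21:12:20
print(to_utc('2014-05-07T00:30:06+07:00'))  # 2014-05-06 17:30:06
```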

Thank you

Upvotes: 4

Views: 897

Answers (3)

uhriab

Reputation: 21

You can use arrow to easily parse a date string that carries a time zone offset and convert it to UTC.

>>> import arrow
>>> a = arrow.get('2014-05-08T22:30:57-04:00').to('utc')
>>> a
<Arrow [2014-05-09T02:30:57+00:00]>

Get a datetime object or a timestamp (in arrow 1.0 and later, timestamp is a method, so call a.timestamp() instead):

>>> a.datetime
datetime.datetime(2014, 5, 9, 2, 30, 57, tzinfo=tzutc())
>>> a.naive
datetime.datetime(2014, 5, 9, 2, 30, 57)
>>> a.timestamp
1399602657

Upvotes: 2

Marshall Farrier

Reputation: 967

The following solution should be faster and avoids importing external libraries. The downside is that it will only work if the date strings are all guaranteed to have the specified format. If that's not the case, then I would prefer Simeon's solution, which lets dateutil.parser.parse() take care of any inconsistencies.

import datetime as dt

def parse_date(datestr):
    # Expects exactly 'YYYY-MM-DDTHH:MM:SS+HH:MM' (or -HH:MM); returns a naive UTC datetime.
    local = dt.datetime.strptime(datestr[:19], '%Y-%m-%dT%H:%M:%S')
    diff = dt.timedelta(hours=int(datestr[20:22]), minutes=int(datestr[23:]))
    if datestr[19] == '-':
        # A negative offset means local time is behind UTC, so add it back.
        return local + diff
    return local - diff
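
To get from this kind of fixed-width parsing to the question's actual goal, one option is to convert every login_date to UTC and collect the users whose converted date falls in April. The sketch below is self-contained and only illustrative: parse_date_utc and april_users are hypothetical names, the emails are placeholders, and the third record is a made-up edge case. It treats a negative offset as "behind UTC", so that offset is added back.

```python
import datetime as dt

def parse_date_utc(datestr):
    """Parse 'YYYY-MM-DDTHH:MM:SS+HH:MM' (fixed width) into a naive UTC datetime."""
    local = dt.datetime.strptime(datestr[:19], '%Y-%m-%dT%H:%M:%S')
    offset = dt.timedelta(hours=int(datestr[20:22]), minutes=int(datestr[23:25]))
    # "-04:00" means local time is behind UTC, so the offset is added back.
    return local + offset if datestr[19] == '-' else local - offset

def april_users(items, year=2014):
    """Return the set of emails with at least one login in April of `year` (UTC)."""
    users = set()
    for item in items:
        when = parse_date_utc(item['login_date'])
        if when.year == year and when.month == 4:
            users.add(item['email'])
    return users

# Two date strings from the question plus one made-up edge case; emails are placeholders.
items = [
    {'email': 'a@example.com', 'login_date': '2014-05-08T22:30:57-04:00'},  # May in UTC
    {'email': 'b@example.com', 'login_date': '2014-04-25T13:27:48-08:00'},  # April in UTC
    {'email': 'c@example.com', 'login_date': '2014-04-30T22:00:00-04:00'},  # 2014-05-01 02:00 UTC
]
print(len(april_users(items)))  # 1
```

The third record shows why the conversion has to happen before the month check: it is an April login in local time but a May login in UTC.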

Upvotes: 1

Simeon Visser

Reputation: 122486

It is probably best to use the dateutil and pytz packages for this purpose. dateutil.parser.parse turns the string into a timezone-aware datetime object, and pytz converts it to UTC:

>>> s = '2014-05-08T22:30:57-04:00'
>>> import dateutil.parser
>>> import pytz
>>> pytz.UTC.normalize(dateutil.parser.parse(s))
datetime.datetime(2014, 5, 9, 2, 30, 57, tzinfo=<UTC>)
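
If installing extra packages is not an option, the standard library may be enough on newer interpreters. The sketch below assumes Python 3.7 or later, which is an assumption on my part (the u'' strings in the question suggest Python 2, where datetime.fromisoformat does not exist).

```python
from datetime import datetime, timezone

s = '2014-05-08T22:30:57-04:00'

# fromisoformat() (Python 3.7+) understands the trailing "-04:00"
# and returns a timezone-aware datetime.
aware = datetime.fromisoformat(s)

# astimezone() expresses the same instant in UTC.
utc = aware.astimezone(timezone.utc)

print(utc.isoformat())  # 2014-05-09T02:30:57+00:00
print(utc.month == 4)   # False
```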

Upvotes: 3
