hooliooo

Reputation: 548

Python Error Handling concerning datetime and time

I have a variable called pubdate that is derived from RSS feeds. Most of the time it is a time tuple, which is what I want, so there are no errors.

Sometimes it is a unicode string, and that is where it gets annoying.

So far, I have the following code for the case where pubdate is a unicode string:

import re
import time
from datetime import datetime

if isinstance(pubdate, unicode):
    try:
        # turn the string into a Unix timestamp
        pubdate = time.mktime(datetime.strptime(pubdate, '%d/%m/%Y %H:%M:%S').timetuple())
    except ValueError:
        # remove day names followed by a comma, e.g. 'Mon, ' or 'Tue, ', then retry with a second format
        pubdate = re.sub(r'\w+,\s*', '', pubdate)
        pubdate = time.mktime(datetime.strptime(pubdate, '%d %b %Y %H:%M:%S').timetuple())

My problem is that if pubdate is a unicode string in a format other than the one handled in the except ValueError clause, the second strptime call raises another ValueError. What is the Pythonic way to deal with multiple ValueError cases?

Upvotes: 0

Views: 26139

Answers (2)

Martin Evans

Reputation: 46779

You could loop over a list of candidate formats and stop at the first one that parses:

from datetime import datetime
import time

pub_dates = ['2/5/2013 12:23:34', 'Monday 2 Jan 2013 12:23:34', 'mon 2 Jan 2013 12:23:34', '10/14/2015 11:11', '10 2015']

for pub_date in pub_dates:
    pubdate = 0     # value if all conversion attempts fail
    
    for date_format in ['%d/%m/%Y %H:%M:%S', '%d %b %Y %H:%M:%S', '%a %d %b %Y %H:%M:%S', '%A %d %b %Y %H:%M:%S', '%m/%d/%Y %H:%M']:
        try:
            pubdate = time.mktime(datetime.strptime(pub_date, date_format).timetuple()) # turn the string into a unix timestamp
            break
        except ValueError:
            pass    # this format did not match, so try the next one
    
    print(f'{pubdate:<12}  {pub_date}')

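This gives the following output (the exact timestamps depend on your local time zone, because time.mktime() interprets the time tuple as local time):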

1367493814.0  2/5/2013 12:23:34
1357129414.0  Monday 2 Jan 2013 12:23:34
1357129414.0  mon 2 Jan 2013 12:23:34
1444817460.0  10/14/2015 11:11
0             10 2015
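
If you want to reuse this in your feed code, the same loop can be wrapped in a function. This is only a sketch: the name to_timestamp and the DATE_FORMATS tuple are my own choices, the format list is just the one from the example above, and I have assumed that returning None (rather than 0) is an acceptable way to signal that nothing matched.

```python
import time
from datetime import datetime

# Candidate formats, tried in order. Assumed list: extend it with whatever
# formats your feeds actually produce.
DATE_FORMATS = (
    '%d/%m/%Y %H:%M:%S',
    '%d %b %Y %H:%M:%S',
    '%a %d %b %Y %H:%M:%S',
    '%A %d %b %Y %H:%M:%S',
    '%m/%d/%Y %H:%M',
)


def to_timestamp(text, formats=DATE_FORMATS):
    """Return a local-time Unix timestamp for text, or None if no format matches."""
    for date_format in formats:
        try:
            parsed = datetime.strptime(text, date_format)
        except ValueError:
            continue  # this format did not match, so try the next one
        return time.mktime(parsed.timetuple())
    return None  # every format failed
```

Returning None keeps a failed parse distinguishable from a real timestamp of 0, so the caller can decide whether to skip the entry or log it.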

Upvotes: 2

luoluo

Reputation: 5533

Since you are parsing date strings from an RSS feed, you probably need a parser that can guess the format. I recommend using dateutil instead of the datetime module.

dateutil.parser offers a generic date/time string parser which is able to parse most known formats to represent a date and/or time.

The prototype of this function is parse(timestr); you don't have to specify the format yourself.

DEMO

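>>> from datetime import datetime
>>> from dateutil.parser import parse
>>> DEFAULT = datetime(2003, 9, 25)  # supplies any fields missing from the string
>>> parse("2003-09-25T10:49:41")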
datetime.datetime(2003, 9, 25, 10, 49, 41)

>>> parse("2003-09-25T10:49")
datetime.datetime(2003, 9, 25, 10, 49)

>>> parse("2003-09-25T10")
datetime.datetime(2003, 9, 25, 10, 0)

>>> parse("2003-09-25")
datetime.datetime(2003, 9, 25, 0, 0)

>>> parse("Sep 03", default=DEFAULT)
datetime.datetime(2003, 9, 3, 0, 0)

Fuzzy parsing:

>>> s = "Today is 25 of September of 2003, exactly " \
...     "at 10:49:41 with timezone -03:00."
>>> parse(s, fuzzy=True)
datetime.datetime(2003, 9, 25, 10, 49, 41,
              tzinfo=tzoffset(None, -10800))
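
To get back to the Unix timestamp from the question, something like the following might work. It is a rough sketch: parse_with_dateutil is a name I made up, it needs the third-party python-dateutil package, and I have assumed that None is a reasonable result for strings that cannot be parsed. dateutil signals an unparseable string with ValueError (or a subclass of it), so a single except clause covers every format.

```python
from dateutil import parser as date_parser  # third-party: python-dateutil


def parse_with_dateutil(text):
    """Return a Unix timestamp for text, or None if dateutil cannot parse it."""
    try:
        parsed = date_parser.parse(text)
    except (ValueError, OverflowError):
        return None  # assumed fallback for unparseable input
    # Naive datetimes are treated as local time, like time.mktime();
    # datetimes that carry a UTC offset are converted using that offset.
    return parsed.timestamp()


for sample in ('Wed, 2 Jan 2013 12:23:34', '2013-01-02T12:23:34', 'hello world'):
    print(sample, '->', parse_with_dateutil(sample))
```

Note that dateutil guesses when a string is ambiguous: by default it reads '2/5/2013' month first, and parse() accepts dayfirst=True if your feeds use day/month order.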

Upvotes: 6
