Reputation: 548
I have a variable called pubdate which is derived from RSS feeds. Most of the time it's a time tuple, which is what I want it to be, so there are no errors.
Sometimes it's a unicode string, and that's where it gets annoying.
So far, I have the following code to handle pubdate when it is a unicode string:
import re
import time
from datetime import datetime

if isinstance(pubdate, unicode):
    try:
        # turn the string into a Unix timestamp
        pubdate = time.mktime(datetime.strptime(pubdate, '%d/%m/%Y %H:%M:%S').timetuple())
    except ValueError:
        # remove day names followed by a comma, e.g. 'Mon,', 'Tue,'
        pubdate = re.sub(r'\w+,\s*', '', pubdate)
        pubdate = time.mktime(datetime.strptime(pubdate, '%d %b %Y %H:%M:%S').timetuple())
But my problem is that if the unicode string pubdate is in a different format from the one handled in the except ValueError clause, it will raise another ValueError. What's the Pythonic way to deal with multiple ValueError cases?
Upvotes: 0
Views: 26139
Reputation: 46779
You could take the following approach:
from datetime import datetime
import time

pub_dates = ['2/5/2013 12:23:34', 'Monday 2 Jan 2013 12:23:34', 'mon 2 Jan 2013 12:23:34', '10/14/2015 11:11', '10 2015']
date_formats = ['%d/%m/%Y %H:%M:%S', '%d %b %Y %H:%M:%S', '%a %d %b %Y %H:%M:%S', '%A %d %b %Y %H:%M:%S', '%m/%d/%Y %H:%M']

for pub_date in pub_dates:
    pubdate = 0  # value if all conversion attempts fail

    for date_format in date_formats:
        try:
            # turn the string into a Unix timestamp
            pubdate = time.mktime(datetime.strptime(pub_date, date_format).timetuple())
            break  # stop at the first format that matches
        except ValueError:
            pass  # this format did not match, try the next one

    print(f'{pubdate:<12} {pub_date}')
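Giving output as (the exact timestamps depend on your local time zone, because time.mktime interprets the time tuple as local time):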
1367493814.0 2/5/2013 12:23:34
1357129414.0 Monday 2 Jan 2013 12:23:34
1357129414.0 mon 2 Jan 2013 12:23:34
1444817460.0 10/14/2015 11:11
0            10 2015
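The same idea can be wrapped in a small helper so that it is reusable for each feed entry. This is only a sketch: the names to_timestamp and DATE_FORMATS are made up for illustration, and returning None instead of 0 on failure is a design assumption (0 is itself a valid timestamp, so None makes a failed parse easier to tell apart).

```python
import time
from datetime import datetime

# Candidate formats, tried in order; extend this tuple for your own feeds.
DATE_FORMATS = (
    '%d/%m/%Y %H:%M:%S',
    '%d %b %Y %H:%M:%S',
    '%a %d %b %Y %H:%M:%S',
    '%A %d %b %Y %H:%M:%S',
    '%m/%d/%Y %H:%M',
)


def to_timestamp(pub_date, formats=DATE_FORMATS):
    """Return a Unix timestamp (local time), or None if no format matches."""
    for date_format in formats:
        try:
            parsed = datetime.strptime(pub_date, date_format)
        except ValueError:
            continue  # this format did not match, try the next one
        return time.mktime(parsed.timetuple())
    return None  # every format failed
```

A caller can then write something like `pubdate = to_timestamp(pubdate)` and check for None, rather than nesting one try/except per format.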
Upvotes: 2
Reputation: 5533
Since you are parsing date strings from an RSS feed, you may need some guessing when parsing them. I recommend using dateutil (a third-party package, installed as python-dateutil) instead of the datetime module.
dateutil.parser offers a generic date/time string parser which is able to parse most known formats to represent a date and/or time.
The prototype of this function is: parse(timestr) (you don't have to specify the format yourself).
DEMO
>>> from datetime import datetime
>>> from dateutil.parser import parse
>>> DEFAULT = datetime(2003, 9, 25)  # assumed value: supplies fields missing from the string
>>> parse("2003-09-25T10:49:41")
datetime.datetime(2003, 9, 25, 10, 49, 41)
>>> parse("2003-09-25T10:49")
datetime.datetime(2003, 9, 25, 10, 49)
>>> parse("2003-09-25T10")
datetime.datetime(2003, 9, 25, 10, 0)
>>> parse("2003-09-25")
datetime.datetime(2003, 9, 25, 0, 0)
>>> parse("Sep 03", default=DEFAULT)
datetime.datetime(2003, 9, 3, 0, 0)
Fuzzy parsing:
>>> s = "Today is 25 of September of 2003, exactly " \
... "at 10:49:41 with timezone -03:00."
>>> parse(s, fuzzy=True)
datetime.datetime(2003, 9, 25, 10, 49, 41,
                  tzinfo=tzoffset(None, -10800))
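If adding a dependency is not an option, the standard library may already cover the common case: RSS 2.0 specifies RFC 822 style dates, which email.utils.parsedate_to_datetime can parse without a format string. The sketch below is an illustration under that assumption, not a replacement for dateutil's fuzzy matching; the helper name rfc822_to_timestamp is hypothetical.

```python
from email.utils import parsedate_to_datetime


def rfc822_to_timestamp(pub_date):
    """Return a Unix timestamp for an RFC 822 style date, or None on failure."""
    try:
        parsed = parsedate_to_datetime(pub_date)
    except (TypeError, ValueError):
        # Older Python versions raise TypeError for unparseable input,
        # newer ones raise ValueError, so both are caught here.
        return None
    return parsed.timestamp()


print(rfc822_to_timestamp('Wed, 02 Jan 2013 12:23:34 GMT'))  # 1357129414.0
print(rfc822_to_timestamp('not a date'))                     # None
```

Strings that fail here could still be handed to a format list or to dateutil as a fallback.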
Upvotes: 6