DmVinny

Reputation: 173

Get year from unknown date format using python

I am querying a server for specific data and need to extract the year from the date field it returns. However, the format of the date field varies, for example:

2009
2009-10-8
2009-10
2017-10-22
2017-10

The obvious approach would be to split the date into an array and take the max (but there is a problem):

year = max(d.split('-'))

For some reason this gives false positives, since 22 comes out as the max instead of 2017. Also, if future calls to the server return the date as "2019/10/20", this will cause issues as well.

Upvotes: 0

Views: 147

Answers (2)

RoadRunner

Reputation: 26315

I would use the python-dateutil library to easily extract the year from a date string:

from dateutil.parser import parse

dates = ['2009', '2009-10-8', '2009-10']

for date in dates:
    print(parse(date).year)

Output:

2009
2009
2009
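The question also mentions a possible "2019/10/20" format. Here is a minimal sketch of how the same parser could be wrapped for that case, assuming python-dateutil is installed. The helper name extract_year and the choice to return None for unparseable input are my own assumptions, not part of the library:

```python
from dateutil.parser import parse


def extract_year(text):
    """Return the year dateutil finds in text, or None if parsing fails.

    extract_year is a hypothetical helper, not a dateutil function.
    """
    try:
        return parse(text).year
    except (ValueError, OverflowError):
        # dateutil raises a ValueError subclass for strings it cannot parse
        return None


for date in ['2017-10-22', '2017-10', '2019/10/20', 'not a date']:
    print(date, '->', extract_year(date))
```

Note that parse() fills in missing fields (month, day) from today's date, so only the year should be trusted for partial inputs like '2009'.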

Upvotes: 2

Thomas

Reputation: 181725

The problem is that, while 2017 > 22, '2017' < '22' because it's a string comparison. You could do this to resolve that:

year = max(map(int, d.split('-')))
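To make the difference concrete, here is a small sketch that also splits on '/' for the "2019/10/20" case from the question. The function name year_by_max is hypothetical, and it assumes the year is always the largest number in the string:

```python
import re


def year_by_max(d):
    """Split on '-' or '/' and return the largest field as an int.

    Hypothetical helper; assumes the year is the largest number present.
    """
    return max(int(part) for part in re.split(r'[-/]', d))


d = '2017-10-22'
print(max(d.split('-')))   # string comparison picks '22'
print(year_by_max(d))      # integer comparison picks 2017
print(year_by_max('2019/10/20'))
```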

But instead, if you don't mind being frowned upon by the Long Now Foundation, consider using a regular expression to extract any 4-digit number:

import re

match = re.search(r'\b\d{4}\b', d)
if match:
    year = int(match.group(0))
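Wrapped in a function, the regex approach might look like the sketch below. The names YEAR_RE and find_year are my own, and returning None when nothing matches is an assumption about what the caller wants:

```python
import re

# Any standalone run of exactly four digits is treated as the year.
YEAR_RE = re.compile(r'\b\d{4}\b')


def find_year(d):
    """Return the first standalone 4-digit number in d as an int, else None."""
    match = YEAR_RE.search(d)
    return int(match.group(0)) if match else None


for d in ['2009', '2009-10-8', '2017-10', '2019/10/20', '10-8']:
    print(d, '->', find_year(d))
```

Because it only looks for four digits between word boundaries, this works regardless of the separator, but it will not find a year in an unseparated string such as '20191020'.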

Upvotes: 4
