Zlo

Reputation: 1170

Switching between date formats in a for loop; Python

The date in my data is stored in two different formats:

Dienstag 31. Dezember 2013 and 30. Juni 2007

I wrote scripts to extract Year/Month/Day from both formats and store them in a list:

for row in reader:
    line_count = line_count + 1
    if row[1] == "DATE":
        pass
    else:
        date = row[1].encode('utf-8')
        year = date.split('.')[1].split(" ")[2]
        day = date.split(" ")[1]  # second token; the first one is the weekday name
        day = day.replace('.', '')
        month = date.split('.')[1].split(' ')[1]

for the first format

and

date = row[1].encode('utf-8')
year = date.split('.')[1].split(" ")[2]
day = date.split(" ")[0]
day = day.replace('.', '')
month = date.split('.')[1].split(' ')[1]

for the second format

However, the two formats occur randomly throughout the dataset (row[1]). Is there a way to tell Python to use the respective script whenever it encounters one of the formats (like an if statement)? Thanks.

Upvotes: 0

Views: 99

Answers (4)

Iron Fist

Reputation: 10951

Another approach with regex, just to give you more options:

import re

if re.search(r'^[a-zA-Z]', date):
    pass  # method for the first format (starts with a weekday name)
else:
    pass  # method for the second format (starts with the day number)
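
A minimal sketch of how this check could drive the two extraction methods, assuming Python 3 and text (not byte) strings. The helper names parse_with_weekday, parse_without_weekday and parse_date are hypothetical and not taken from the question:

```python
import re

def parse_with_weekday(date):
    # "Dienstag 31. Dezember 2013" -> tokens: weekday, "31.", month, year
    _, day, month, year = date.split()
    return day.rstrip('.'), month, year

def parse_without_weekday(date):
    # "30. Juni 2007" -> tokens: "30.", month, year
    day, month, year = date.split()
    return day.rstrip('.'), month, year

def parse_date(date):
    # A leading letter means the string starts with a weekday name.
    if re.search(r'^[a-zA-Z]', date):
        return parse_with_weekday(date)
    return parse_without_weekday(date)

print(parse_date("Dienstag 31. Dezember 2013"))
print(parse_date("30. Juni 2007"))
```

Both calls return a (day, month, year) tuple of strings, so the loop body can stay the same regardless of which format a row uses.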

Upvotes: 0

Aditya

Reputation: 3158

I don't know whether you are required to use string splitting, but regular expressions are better suited to a problem of this kind. The best part is that the approach is robust yet flexible: you can easily modify the pattern if you expect more formats (maybe American style like January 31, 2004). It takes five lines of code rather than the original 15 ;)

Here's the code:

import re

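# Optional weekday name (group 1), then day, month name and year (groups 2-4).
reg_date = r"(Montag|Dienstag|Mittwoch|Donnerstag|Freitag|Samstag|Sonntag)*\s*(\d{1,2})\.\s+(\w{3,12})\s(\d{2,4})"

def extract_date(string):
    results = re.search(reg_date, string)
    if results:
        date = results.groups()
        return date[1], date[2], date[3]
    # Falls through and returns None when the pattern does not match.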

And to use this, simply write a line like:

day,month,year = extract_date("Dienstag 31. Dezember 2013 and ")
print day,month,year

or another experiment with the second format

day,month,year = extract_date("31. May 2013 ")
print day,month,year


Simple, Elegant, Reusable.
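
To connect this to the loop in the question, here is a hedged sketch, assuming Python 3. The `rows` list is invented sample data standing in for the CSV reader, and the named groups are a variation on the pattern above rather than the original code. Rows that do not match (including the header) are skipped, so tuple unpacking never receives None:

```python
import re

# Weekday is optional and non-capturing; day, month, year are named groups.
DATE_RE = re.compile(
    r"(?:Montag|Dienstag|Mittwoch|Donnerstag|Freitag|Samstag|Sonntag)?\s*"
    r"(?P<day>\d{1,2})\.\s+(?P<month>\w{3,12})\s+(?P<year>\d{2,4})"
)

def extract_date(text):
    """Return (day, month, year) as strings, or None if nothing matches."""
    match = DATE_RE.search(text)
    if match is None:
        return None
    return match.group("day"), match.group("month"), match.group("year")

# Hypothetical rows standing in for the CSV reader in the question.
rows = [
    ["id", "DATE"],
    ["1", "Dienstag 31. Dezember 2013"],
    ["2", "30. Juni 2007"],
    ["3", "not a date"],
]

dates = []
for row in rows:
    parsed = extract_date(row[1])
    if parsed is not None:  # skips the header and unparseable rows
        dates.append(parsed)

print(dates)
```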

Upvotes: 2

alexisdevarennes

Reputation: 5642

You can check if the first character in the string is alpha.

if date[0].isalpha():
    pass  # call your function for the weekday-prefixed format here
else:
    pass  # call the other function

Upvotes: 1

The6thSense

Reputation: 8335

This works if and only if the second pattern always starts with a number:

if date[0].isdigit():
    pass  # method for pattern 2
else:
    pass  # method for pattern 1
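
One caveat with indexing date[0]: it raises IndexError on an empty string, and leading whitespace would defeat the check. A small sketch of a more defensive test, assuming Python 3; starts_with_day_number is a hypothetical helper name:

```python
def starts_with_day_number(date):
    # Strip surrounding whitespace first; slicing [:1] yields "" for an
    # empty string instead of raising IndexError like date[0] would.
    return date.strip()[:1].isdigit()

for sample in ["30. Juni 2007", " Dienstag 31. Dezember 2013", ""]:
    print(repr(sample), starts_with_day_number(sample))
```

An empty cell is reported as not starting with a number, so it would be routed to the other branch and may still need its own handling there.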

Upvotes: 2
