Reputation: 640
I have many log files with lines in a format like:
2012-09-12 23:12:00 other logs here
and I need to extract the time string and compare the time delta between two log records. I did that with this:
for line in log:
    l = line.strip().split()
    timelist = [int(n) for n in re.split("[- :]", l[0] + ' ' + l[1])]
    # now timelist looks like [2012, 9, 12, 23, 12, 0]
Then when I have two records:
d1 = datetime.datetime(timelist1[0], timelist1[1], timelist1[2], timelist1[3], timelist1[4], timelist1[5])
d2 = datetime.datetime(timelist2[0], timelist2[1], timelist2[2], timelist2[3], timelist2[4], timelist2[5])
delta = (d2-d1).seconds
The problem is that it runs slowly. Is there any way to improve the performance? Thanks in advance.
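As an aside on the snippet above: `(d2 - d1).seconds` only returns the seconds component of the timedelta and silently drops full days; `timedelta.total_seconds()` covers the whole span. A minimal sketch with made-up timestamps:

```python
import datetime

d1 = datetime.datetime(2012, 9, 12, 23, 12, 0)
d2 = datetime.datetime(2012, 9, 14, 1, 0, 0)  # more than a day later

delta = d2 - d1
print(delta.seconds)          # 6480 -- only the seconds part of "1 day, 1:48:00"
print(delta.total_seconds())  # 92880.0 -- the full span in seconds
```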
Upvotes: 1
Views: 206
Reputation: 20339
You can also try it without a regexp, using the optional maxsplit argument of split:
(date, time, log) = line.split(" ", 2)
timerecord = datetime.datetime.strptime(date + " " + time, "%Y-%m-%d %H:%M:%S")
and then it'd be a matter of computing your timedeltas between consecutive timerecords.
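Putting the approach from this answer into a runnable loop (the sample log lines are hypothetical, in the question's format):

```python
import datetime

log = [
    "2012-09-12 23:12:00 other logs here",
    "2012-09-12 23:12:30 more logs here",
]

prev = None
for line in log:
    # maxsplit=2: split off date and time, leave the rest of the line intact
    date, time, rest = line.split(" ", 2)
    timerecord = datetime.datetime.strptime(date + " " + time, "%Y-%m-%d %H:%M:%S")
    if prev is not None:
        print((timerecord - prev).total_seconds())  # 30.0
    prev = timerecord
```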
Upvotes: 1
Reputation: 23575
You could do it entirely with regular expressions, which might be faster.
find_time = re.compile(r"^(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})")
for line in log:
    timelist = find_time.match(line)
    if timelist:
        d = datetime.datetime(*map(int, timelist.groups()))
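A self-contained version of the compiled-regex approach, applied to a single hypothetical line:

```python
import datetime
import re

# Raw string so the \d escapes reach the regex engine untouched
find_time = re.compile(r"^(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})")

line = "2012-09-12 23:12:00 other logs here"
m = find_time.match(line)
if m:
    # groups() yields the six captured strings; map(int, ...) converts them
    d = datetime.datetime(*map(int, m.groups()))
    print(d)  # 2012-09-12 23:12:00
```

Compiling the pattern once outside the loop avoids re-parsing the regex on every line.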
Upvotes: 1
Reputation: 298276
You could get rid of the regex and use map:
date_time = datetime.datetime
for line in log:
    date, time = line.strip().split(' ', 2)[:2]
    timelist = map(int, date.split('-') + time.split(':'))
    d = date_time(*timelist)
.split(' ', 2) will be faster than just .split() because it only splits up to two times and only on spaces, not on any whitespace. map(int, l) is faster than [int(x) for x in l], the last time I checked. You may also be able to drop the .strip().
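The split-and-map parsing above, run on one hypothetical line (note that in Python 3 `map` returns an iterator, but `*`-unpacking it into the constructor still works):

```python
import datetime

line = "2012-09-12 23:12:00 other logs here"

# Split off date and time; the trailing log text lands in the discarded third field
date, time = line.split(' ', 2)[:2]
timelist = map(int, date.split('-') + time.split(':'))
d = datetime.datetime(*timelist)
print(d)  # 2012-09-12 23:12:00
```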
Upvotes: 1