daniel402

Reputation: 67

find timestamp by date_format with regexp

I'm thinking about a function that can find a timestamp in a logfile, given a date format as an argument, like:

def find_some_dates(logfile, timestamp_format='%d/%b/%Y %H:%M:%S.%f'):
    # find timestamps by timestamp_format
    # pass it to datetime.strptime
    # return unix timestamp

The timestamp can be anywhere inside a line. E.g.

[1] 17/Dec/2014 15:00:21.777 something happened
On 17/Dec/2014 15:00:21.777 something happened
17/Dec/2014 15:00:21.777 - something happened

I was thinking about some sort of mapping that takes the timestamp_format and translates it into a regexp. Is there a better way to do it?

Upvotes: 2

Views: 153

Answers (1)

daniel402

Reputation: 67

All right, here's what I came up with. Assuming there's no other text in front of the logfile's timestamp, I could use this:

from datetime import datetime

line = "17/Dec/2014 15:00:21.777 something happened right here"

def find_some_dates(log_line, timestamp_format='%d/%b/%Y %H:%M:%S.%f'):
    try:
        # Succeeds only if the whole line is nothing but a timestamp.
        return datetime.strptime(log_line, timestamp_format)
    except ValueError as err:
        # If the line starts with a timestamp, the message reads
        # "unconverted data remains: <rest of line>"; print that remainder.
        print(err.args[0].split(':', 1).pop())

    # get substr with logfile timestamp and rerun the whole thing to convert to unix timestamp

find_some_dates(line)
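
The step left as a comment above (cut out the timestamp substring and parse it again) could look roughly like the sketch below. It relies on the wording of CPython's "unconverted data remains" error message, which is an implementation detail and not a documented interface, so treat it as fragile. The names parse_leading_timestamp and REMAINS_PREFIX are made up for this illustration, and the result assumes an English/C locale for %b.

```python
from datetime import datetime

# Assumption: CPython words the error this way; this is not a documented API.
REMAINS_PREFIX = "unconverted data remains: "

def parse_leading_timestamp(log_line, timestamp_format='%d/%b/%Y %H:%M:%S.%f'):
    """Return the Unix timestamp found at the start of log_line, or None."""
    try:
        parsed = datetime.strptime(log_line, timestamp_format)
    except ValueError as err:
        message = str(err)
        if not message.startswith(REMAINS_PREFIX):
            # The line does not start with a matching timestamp.
            return None
        remainder = message[len(REMAINS_PREFIX):]
        # Drop the unparsed tail and parse only the timestamp part.
        timestamp_text = log_line[:len(log_line) - len(remainder)]
        parsed = datetime.strptime(timestamp_text, timestamp_format)
    # A naive datetime is interpreted as local time by timestamp().
    return parsed.timestamp()

line = "17/Dec/2014 15:00:21.777 something happened right here"
print(parse_leading_timestamp(line))
```

This only handles a timestamp at the very start of the line; a line such as "On 17/Dec/2014 ..." yields None.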

Because the timestamp is not always at the start of the line, I wrote a parser that loops over the mappings below and uses re.sub to turn timestamp_format into a regular expression:

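# Every key is a tuple of directives that share one pattern.
# '%Y' is included because the default timestamp_format uses it.
format_mapping = {('%a', '%A', '%B', '%b'): '[a-zA-Z]+',
                  ('%d', '%m', '%w', '%H', '%y', '%Y', '%f', '%M', '%I', '%S', '%U', '%j'): '[0-9]+',
                  ('%Z',): '[A-Z]+'}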
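
The parser itself is not shown, so here is a minimal sketch of how such a format-to-regexp translation might work. It is not the original code: it uses re.split instead of re.sub, a flat per-directive table with bounded digit counts instead of the mapping above, and the names DIRECTIVE_PATTERNS, format_to_regex and find_timestamp are invented for this example. It covers only the listed directives and assumes an English/C locale for month and day names.

```python
import re
from datetime import datetime

LETTERS = r'[A-Za-z]+'
# Hypothetical table: one regex fragment per supported strptime directive.
DIRECTIVE_PATTERNS = {
    'a': LETTERS, 'A': LETTERS, 'b': LETTERS, 'B': LETTERS,
    'd': r'\d{1,2}', 'm': r'\d{1,2}', 'H': r'\d{1,2}', 'I': r'\d{1,2}',
    'M': r'\d{1,2}', 'S': r'\d{1,2}', 'y': r'\d{2}', 'Y': r'\d{4}',
    'j': r'\d{1,3}', 'f': r'\d{1,6}', 'Z': r'[A-Z]+',
}

def format_to_regex(timestamp_format):
    """Translate a strptime format string into a compiled regex."""
    # With one capture group, re.split alternates literal text (even
    # indices) and directive letters (odd indices).
    pieces = re.split(r'%(.)', timestamp_format)
    pattern = []
    for index, piece in enumerate(pieces):
        if index % 2 == 0:
            pattern.append(re.escape(piece))  # literal separators
        elif piece == '%':
            pattern.append('%')               # '%%' is a literal percent sign
        else:
            # Raises KeyError for directives missing from the table.
            pattern.append(DIRECTIVE_PATTERNS[piece])
    return re.compile(''.join(pattern))

def find_timestamp(log_line, timestamp_format='%d/%b/%Y %H:%M:%S.%f'):
    """Return the Unix timestamp of the first match in log_line, or None."""
    match = format_to_regex(timestamp_format).search(log_line)
    if match is None:
        return None
    try:
        parsed = datetime.strptime(match.group(0), timestamp_format)
    except ValueError:
        return None  # looked like a timestamp but was not a valid date
    return parsed.timestamp()  # naive datetime, interpreted as local time

for sample in ("[1] 17/Dec/2014 15:00:21.777 something happened",
               "On 17/Dec/2014 15:00:21.777 something happened",
               "17/Dec/2014 15:00:21.777 - something happened"):
    print(find_timestamp(sample))
```

The regexp only locates a candidate substring; datetime.strptime still does the actual validation, so loose patterns such as \d{1,2} for the day are acceptable here.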

Upvotes: 1
