split string to be inserted into database

Question

I have a text file containing many lines like this one:

Jul 15 12:12:51 whitelist logger: 1|999999999999|id:d9faff7c-4016-4343-b494-37028763bb66 submit date:1307130919 done date:1307130919 stat:DELIVRD err:0|L_VB3_NM_K_P|1373687445|vivnel2|L_VB3_GH_K_P|promo_camp1-bd153424349bc647|1

I need to insert these values into a database, so I first need to separate them into the following fields:

1) logger
2) submit date
3) done date
4) stat
5) err

The following command works to isolate the logger string:

tail messages | grep logger: | awk -F'logger: ' '{print $2}' | awk '{print $1}'

Is this the right way to split a string? Is there a better option?

calmrat · Accepted Answer

There are many ways to accomplish this in Python. One simple approach is to use Python's built-in regular expression module, re. Assuming every log line follows the format shown in the question, you could extract the parts of interest like this:

import re

s = "Jul 15 12:12:51 whitelist logger: 1|999999999999|id:d9faff7c-4016-4343-b494-37028763bb66 submit date:1307130919 done date:1307130919 stat:DELIVRD err:0|L_VB3_NM_K_P|1373687445|vivnel2|L_VB3_GH_K_P|promo_camp1-bd153424349bc647|1"

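# Adjacent raw-string literals are joined into a single pattern.
logger_re = re.compile(
    r"logger: ([^ ]+)"
    r" submit date:(\d+)"
    r" done date:(\d+)"
    r" stat:(.+)"
    r" err:(.+)$")  # the last group captures everything after "err:"

print(logger_re.search(s).groups())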

The .groups() method returns a tuple of the strings captured by the parenthesized groups, in the order they appear in the pattern.
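If you would rather look the fields up by name than by position, named groups are one option. The following is only a sketch: the group names and the parse_line helper are made up for illustration, and it assumes that err is always a numeric code, so that capture stops before the first "|" instead of running to the end of the line.

```python
import re

# Named groups make each captured field self-describing.
# Assumption: "err" is purely numeric, so \d+ stops before the "|".
LOGGER_RE = re.compile(
    r"logger: (?P<logger>\S+)"
    r" submit date:(?P<submit_date>\d+)"
    r" done date:(?P<done_date>\d+)"
    r" stat:(?P<stat>\S+)"
    r" err:(?P<err>\d+)"
)


def parse_line(line):
    """Return a dict of the five fields, or None if the line does not match."""
    match = LOGGER_RE.search(line)
    return match.groupdict() if match else None


sample = ("Jul 15 12:12:51 whitelist logger: 1|999999999999|"
          "id:d9faff7c-4016-4343-b494-37028763bb66 submit date:1307130919 "
          "done date:1307130919 stat:DELIVRD err:0|L_VB3_NM_K_P|1373687445|"
          "vivnel2|L_VB3_GH_K_P|promo_camp1-bd153424349bc647|1")

fields = parse_line(sample)
print(fields)
```

Returning None for lines that do not match lets you skip unrelated log entries without raising an exception.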

See http://docs.python.org/2/library/re.html
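Since the end goal is to insert the values into a database, here is a rough sketch of that step. The question does not say which database is in use, so this assumes SQLite through Python's standard sqlite3 module as a stand-in; the table name delivery_log and its column names are hypothetical, and the row is typed in by hand to stand for one already-split log line.

```python
import sqlite3

# Hypothetical row, standing in for one log line already split into
# (logger, submit_date, done_date, stat, err).
rows = [
    ("1|999999999999|id:d9faff7c-4016-4343-b494-37028763bb66",
     1307130919, 1307130919, "DELIVRD", 0),
]

conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
conn.execute(
    "CREATE TABLE delivery_log ("
    "logger TEXT, submit_date INTEGER, done_date INTEGER, "
    "stat TEXT, err INTEGER)"
)

# "?" placeholders let the driver handle quoting of the values.
conn.executemany("INSERT INTO delivery_log VALUES (?, ?, ?, ?, ?)", rows)
conn.commit()

stored = conn.execute(
    "SELECT logger, submit_date, done_date, stat, err FROM delivery_log"
).fetchall()
print(stored)
```

Other database drivers follow the same general pattern, although the placeholder syntax differs between them (some use %s instead of ?).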
