Joshua

Reputation: 269

Python RegEx - Getting multiple pieces of information out of a string

I'm trying to use Python to parse a log file and match four pieces of information with one regex: the epoch time, SERVICE NOTIFICATION, the hostname, and CRITICAL. I can't get this to work; so far I've only been able to match two of the four. Is it possible to do this? Below is an example line from the log file and the code I've gotten to work so far. Any help would make me a happy noob.

[1242248375] SERVICE ALERT: myhostname.com;DNS: Recursive;CRITICAL;SOFT;1;CRITICAL - Plugin timed out while executing system call

hostname = options.hostname

n = open('/var/tmp/nagios.log', 'r')
n.readline()
l = [str(x) for x in n]
for line in l:
    match = re.match (r'^\[(\d+)\] SERVICE NOTIFICATION: ', line)
    if match:
       timestamp = int(match.groups()[0])
       print timestamp

Upvotes: 2

Views: 15426

Answers (5)

Mike Kale

Reputation: 4133

You can use more than one group at a time, e.g.:

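import re

logstring = '[1242248375] SERVICE ALERT: myhostname.com;DNS: Recursive;CRITICAL;SOFT;1;CRITICAL - Plugin timed out while executing system call'
# Groups: timestamp, alert type, hostname, state (the service description is skipped).
# The raw string keeps the backslashes in the pattern from being treated as string escapes.
exp = re.compile(r'^\[(\d+)\] ([A-Z ]+): ([A-Za-z0-9.\-]+);[^;]+;([A-Z]+);')
m = exp.search(logstring)

for s in m.groups():
    print(s)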

Upvotes: 2

user114075

Reputation: 11

If you are looking to split out those particular parts of the line, then something along the lines of:

match = re.match(r'^\[(\d+)\] (.*?): (.*?);.*?;(.*?);',line)

should give you each of those parts at its own index in match.groups().
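As a quick check, here is a minimal sketch that runs that pattern against the sample line from the question. The variable names (timestamp, kind, host, state) are only illustrative labels for the four groups.

```python
import re

# Sample line copied from the question.
line = ('[1242248375] SERVICE ALERT: myhostname.com;DNS: Recursive;'
        'CRITICAL;SOFT;1;CRITICAL - Plugin timed out while executing system call')

# Each lazy group (.*?) stops at the first delimiter that follows it.
match = re.match(r'^\[(\d+)\] (.*?): (.*?);.*?;(.*?);', line)
if match:
    # Illustrative names for the four captured groups.
    timestamp, kind, host, state = match.groups()
    print(timestamp, kind, host, state)
    # prints: 1242248375 SERVICE ALERT myhostname.com CRITICAL
```

Note that the groups come back as strings, so the timestamp still needs int() if you want a number.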

Upvotes: 1

Dietrich Epp

Reputation: 213298

The question is a bit confusing, but you don't need to do everything with regular expressions. There are some good plain old string functions you might want to try, like 'split'.

This version will also refrain from loading the entire file in memory at once, and it will close the file even when an exception is thrown.

import re

regexp = re.compile(r'\[(\d+)\] SERVICE NOTIFICATION: (.+)')
with open('/var/tmp/nagios.log', 'r') as log_file:
    for line in log_file:  # reads one line at a time instead of the whole file
        fields = line.split(';')
        # fields[0] is everything before the first ';'
        match = regexp.match(fields[0])
        if match:
            timestamp = int(match.group(1))
            hostname = match.group(2)
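Taking the 'split' idea one step further, all four pieces can be pulled out without any regex. This is only a sketch: parse_line is a hypothetical helper name, and it assumes every line of interest has the same layout as the sample line in the question.

```python
def parse_line(line):
    """Return (timestamp, kind, host, state), or None if the line has another shape.

    Hypothetical helper; assumes the layout of the sample line in the question.
    """
    fields = line.rstrip('\n').split(';')
    if len(fields) < 3:
        return None
    try:
        # '[1242248375] SERVICE ALERT: myhostname.com' -> stamp, kind, host
        stamp, rest = fields[0].split('] ', 1)
        kind, host = rest.split(': ', 1)
        return int(stamp.lstrip('[')), kind, host, fields[2]
    except ValueError:
        # Raised when a delimiter is missing or the timestamp is not a number.
        return None


sample = ('[1242248375] SERVICE ALERT: myhostname.com;DNS: Recursive;'
          'CRITICAL;SOFT;1;CRITICAL - Plugin timed out while executing system call')
print(parse_line(sample))
# prints: (1242248375, 'SERVICE ALERT', 'myhostname.com', 'CRITICAL')
```

You would then filter on the returned values, e.g. keep only lines where the state is 'CRITICAL' and the host equals the one you care about.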

Upvotes: 2

Oddthinking

Reputation: 25282

Could it be as simple as this: the "SERVICE NOTIFICATION" in your pattern doesn't match the "SERVICE ALERT" in your example?
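A small sketch to confirm this, using the pattern and the sample line exactly as posted in the question (the variable names are mine):

```python
import re

# Sample line copied from the question.
line = ('[1242248375] SERVICE ALERT: myhostname.com;DNS: Recursive;'
        'CRITICAL;SOFT;1;CRITICAL - Plugin timed out while executing system call')

# Pattern from the question: looks for the literal text SERVICE NOTIFICATION.
as_posted = re.match(r'^\[(\d+)\] SERVICE NOTIFICATION: ', line)
print(as_posted)  # prints: None

# Same pattern with the literal changed to SERVICE ALERT.
adjusted = re.match(r'^\[(\d+)\] SERVICE ALERT: ', line)
print(adjusted.group(1))  # prints: 1242248375
```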

Upvotes: 0

Alex Martelli

Reputation: 881555

You can use | to match any one of various possible things, and re.findall to get all non-overlapping matches to some RE.
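For instance, here is a rough sketch that combines the two. Only the first line of log_text comes from the question; the other two are made-up lines of the same shape, added just to show what gets picked up and what gets skipped. Real SERVICE NOTIFICATION lines in your log may be laid out differently, so treat the pattern as an assumption to adapt.

```python
import re

log_text = '\n'.join([
    # From the question:
    '[1242248375] SERVICE ALERT: myhostname.com;DNS: Recursive;CRITICAL;SOFT;1;'
    'CRITICAL - Plugin timed out while executing system call',
    # Made-up lines for illustration only:
    '[1242248380] SERVICE NOTIFICATION: otherhost.example;PING;CRITICAL;SOFT;1;CRITICAL - made-up line',
    '[1242248390] SERVICE ALERT: myhostname.com;DNS: Recursive;OK;SOFT;1;OK - made-up line',
])

# (?:ALERT|NOTIFICATION) accepts either word; re.MULTILINE lets ^ match at each line start.
pattern = re.compile(
    r'^\[(\d+)\] (SERVICE (?:ALERT|NOTIFICATION)): ([^;\n]+);[^;\n]*;(CRITICAL);',
    re.MULTILINE,
)

# With several groups in the pattern, findall returns one tuple per match.
matches = pattern.findall(log_text)
for timestamp, kind, host, state in matches:
    print(timestamp, kind, host, state)
# prints:
# 1242248375 SERVICE ALERT myhostname.com CRITICAL
# 1242248380 SERVICE NOTIFICATION otherhost.example CRITICAL
```

The third line is skipped because its state is OK rather than CRITICAL.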

Upvotes: 6
