Reputation: 527
I am using Python to parse Postfix log files. I need to match lines containing any of several patterns and, if a line matches, extract the IP address:
ip = re.search(r'^warning: Connection rate limit exceeded: [0-9]* from .*\[([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})\] for service smtp', message)
if not ip:
    ip = re.search(r'^NOQUEUE: reject: RCPT from .*\[([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})\]: .*: Relay access denied; .*', message)
if not ip:
    ip = re.search(r'^NOQUEUE: reject: RCPT from .*\[([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})\]: .*: Recipient address rejected: .*', message)
...
...
print(ip.group(1))
Any line will only ever match one pattern. I know that normally I can use '(pattern1|pattern2|pattern3)' to match any of multiple patterns, but since I am already using parentheses () to group the IP address I want to extract, I don't know how to do that.
I will have quite a lot of patterns to match. What would be the cleanest, most elegant way to do it?
Upvotes: 3
Views: 1320
Reputation: 474151
You can wrap the alternatives in a non-capturing group, (?:...), which groups them without adding a capture group of its own:
import re

# Raw strings keep backslashes such as \[ and \. intact for the regex engine.
patterns = [
    r"warning: Connection rate limit exceeded: [0-9]* from .*\[([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})\] for service smtp",
    r"NOQUEUE: reject: RCPT from .*\[([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})\]: .*: Relay access denied; .*",
    r"NOQUEUE: reject: RCPT from .*\[([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})\]: .*: Recipient address rejected: .*",
]
# Anchor once at the start, then try each alternative in turn.
pattern = re.compile("^(?:" + "|".join(patterns) + ")")
ip = pattern.search(message)
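One thing to watch with the combined pattern: each alternative still carries its own capturing group, so the groups are numbered 1, 2, 3, ... across the whole expression and only the one belonging to the matching alternative is set. That means group(1) is None whenever the second or third pattern matched. Below is a minimal sketch of one way to pull out the IP regardless of which alternative matched. The names IP and extract_ip, and the sample log lines, are made up for illustration and are not from the question.

```python
import re

# Shared sub-pattern for a dotted-quad IPv4 address (hypothetical helper name).
IP = r"[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}"

patterns = [
    r"warning: Connection rate limit exceeded: [0-9]* from .*\[(" + IP + r")\] for service smtp",
    r"NOQUEUE: reject: RCPT from .*\[(" + IP + r")\]: .*: Relay access denied; .*",
    r"NOQUEUE: reject: RCPT from .*\[(" + IP + r")\]: .*: Recipient address rejected: .*",
]
pattern = re.compile("^(?:" + "|".join(patterns) + ")")


def extract_ip(message):
    """Return the IP captured by whichever alternative matched, or None."""
    match = pattern.search(message)
    if match is None:
        return None
    # Exactly one capturing group takes part in any match, and
    # match.lastindex is the number of that group.
    return match.group(match.lastindex)


# Invented sample lines, shaped like the patterns above.
samples = [
    "warning: Connection rate limit exceeded: 31 from unknown[198.51.100.23] for service smtp",
    "NOQUEUE: reject: RCPT from unknown[203.0.113.7]: 554 5.7.1 <user@example.com>: Relay access denied; from=<a@example.org>",
    "some unrelated line",
]
for line in samples:
    print(extract_ip(line))
```

This relies on every alternative containing exactly one capturing group. If a pattern ever needs extra parentheses, making those non-capturing as well keeps the approach working.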
Upvotes: 3