Reputation: 1182
I have a regular expression with two capturing groups joined by | (OR), and I'm wondering whether it's possible to get back only the group that actually matched. In every case, I want to extract spam.eggs.com.
Example:
import re
monitorName = re.compile(r"HQ01 : HTTP Service - [Ss][Rr][Vv]\d+\.\w+\.com:(\w+\.\w+\.(?:net|com|org))|(\w+\.\w+\.(?:net|com|org))")
test = ["HQ01 : HTTP Service - spam.eggs.com",
        "HQ01 : HTTP Service - spam.eggs.com - DISABLED",
        "HQ01 : HTTP Service - srv04.example.com:spam.eggs.com",
        "HQ01 : HTTP Service - srv04.example.com:spam.eggs.com - DISABLED"]
for t in test:
    m = monitorName.search(t)
    print(m.groups())
Produces:
(None, 'spam.eggs.com')
(None, 'spam.eggs.com')
('spam.eggs.com', None)
('spam.eggs.com', None)
It'd be nice if groups() returned only the one group that matched, not both.
Upvotes: 1
Views: 735
Reputation: 33994
Did you consider this?
HQ01 : HTTP Service - (?:[Ss][Rr][Vv]\d+\.\w+\.com:)?(\w+\.\w+\.(?:net|com|org))
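As a quick check, the pattern above can be run against the four strings from the question. This is only a minimal sketch; the names `pattern`, `tests`, and `results` are illustrative and not part of the original post.

```python
import re

# Pattern from this answer: the "srvNN.host.com:" prefix is wrapped in an
# optional non-capturing group, so only one capturing group remains.
pattern = re.compile(
    r"HQ01 : HTTP Service - (?:[Ss][Rr][Vv]\d+\.\w+\.com:)?(\w+\.\w+\.(?:net|com|org))"
)

# The four test strings from the question.
tests = [
    "HQ01 : HTTP Service - spam.eggs.com",
    "HQ01 : HTTP Service - spam.eggs.com - DISABLED",
    "HQ01 : HTTP Service - srv04.example.com:spam.eggs.com",
    "HQ01 : HTTP Service - srv04.example.com:spam.eggs.com - DISABLED",
]

results = [pattern.search(t).groups() for t in tests]
for r in results:
    print(r)  # each line prints ('spam.eggs.com',)
```

Because the prefix group is non-capturing, groups() should hold a single element in every case.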
Upvotes: 0
Reputation: 20450
I would rewrite the regular expression as follows (the (?i) flag makes the whole pattern case-insensitive, so it belongs at the very start):
monitorName = re.compile(r"(?i)HQ01 : HTTP Service - (?:SRV\d+\.\w+\.com:)?(\w+\.\w+\.(?:net|com|org))")
Produces
('spam.eggs.com',)
('spam.eggs.com',)
('spam.eggs.com',)
('spam.eggs.com',)
You can make a group optional by following it with ?.
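To illustrate the effect of a trailing ? on a smaller case, here is a minimal sketch. The `www.` prefix pattern and the names `host` and `capturing` are made up for illustration and do not come from the question.

```python
import re

# Hypothetical pattern: an optional "www." prefix followed by a captured host.
# (?:...)? makes the prefix optional without adding a capturing group.
host = re.compile(r"^(?:www\.)?(\w+\.com)$")

with_prefix = host.match("www.example.com").groups()
without_prefix = host.match("example.com").groups()
print(with_prefix)     # ('example.com',)
print(without_prefix)  # ('example.com',)

# For contrast: if the optional prefix is a capturing group, it shows up
# in groups() as None whenever it did not take part in the match.
capturing = re.match(r"^(www\.)?(\w+\.com)$", "example.com").groups()
print(capturing)       # (None, 'example.com')
```

Using (?:...) rather than (...) for the optional part is what keeps the unwanted None out of the result.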
Upvotes: 0
Reputation: 7523
The | operator has the lowest precedence, so it applies to everything before it (from the beginning of your regex in this case) OR everything after it. In your regex, if there is no "srv04.example.com" prefix, it isn't even checking whether the string contains "HTTP Service"!
Your two capturing groups are identical, so there's no point in having both. All you want is to make the srv*: part optional, right?
Try this one:
r"HQ01 : HTTP Service - (?:[Ss][Rr][Vv]\d+\.\w+\.com:)?(\w+\.\w+\.(?:net|com|org))"
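The precedence point can be checked directly. The sketch below runs the question's original pattern and the pattern from this answer against a made-up string that contains no "HTTP Service" text at all; the names `original`, `fixed`, and `unrelated` are illustrative assumptions.

```python
import re

# Original pattern from the question: the top-level | splits the WHOLE regex
# into two alternatives, so the right-hand side carries no
# "HQ01 : HTTP Service - " prefix.
original = re.compile(
    r"HQ01 : HTTP Service - [Ss][Rr][Vv]\d+\.\w+\.com:(\w+\.\w+\.(?:net|com|org))"
    r"|(\w+\.\w+\.(?:net|com|org))"
)

# Pattern from this answer: the prefix is always required, only the
# "srvNN.host.com:" part is optional.
fixed = re.compile(
    r"HQ01 : HTTP Service - (?:[Ss][Rr][Vv]\d+\.\w+\.com:)?(\w+\.\w+\.(?:net|com|org))"
)

# Made-up input that has nothing to do with the monitor names.
unrelated = "totally unrelated text: foo.bar.org"

m = original.search(unrelated)
print(m.groups())               # (None, 'foo.bar.org')
print(fixed.search(unrelated))  # None
```

The original pattern still matches through its second alternative, while the rewritten one rejects the string because the literal prefix is missing.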
Upvotes: 2