ajwdev

Reputation: 1182

"Optional" backreferences in regular expression

I have a regular expression with two capturing groups that are OR'd together, and I'm wondering whether it's possible to get back only the group that actually matched. In all cases, I want to capture spam.eggs.com.

Example:

import re

monitorName = re.compile(r"HQ01 : HTTP Service - [Ss][Rr][Vv]\d+\.\w+\.com:(\w+\.\w+\.(?:net|com|org))|(\w+\.\w+\.(?:net|com|org))")

test = ["HQ01 : HTTP Service - spam.eggs.com",
    "HQ01 : HTTP Service - spam.eggs.com - DISABLED",
    "HQ01 : HTTP Service - srv04.example.com:spam.eggs.com",
    "HQ01 : HTTP Service - srv04.example.com:spam.eggs.com - DISABLED"]


for t in test:
    m = monitorName.search(t)
    print(m.groups())

Produces:

(None, 'spam.eggs.com')
(None, 'spam.eggs.com')
('spam.eggs.com', None)
('spam.eggs.com', None)

It would be nice if groups() returned only the one group that matched, not both.

Upvotes: 1

Views: 735

Answers (5)

Antony Hatchkins

Reputation: 33994

Did you consider this?

HQ01 : HTTP Service - (?:[Ss][Rr][Vv]\d+\.\w+\.com:)?(\w+\.\w+\.(?:net|com|org))
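As a quick check, the pattern above can be run against the four sample strings from the question. This is only a minimal sketch: the names `pattern`, `samples`, and `results` are illustrative, not from the original post.

```python
import re

# Optional non-capturing "srvNN.host.com:" prefix, followed by the single capturing group.
pattern = re.compile(
    r"HQ01 : HTTP Service - (?:[Ss][Rr][Vv]\d+\.\w+\.com:)?(\w+\.\w+\.(?:net|com|org))"
)

# The four test strings from the question.
samples = [
    "HQ01 : HTTP Service - spam.eggs.com",
    "HQ01 : HTTP Service - spam.eggs.com - DISABLED",
    "HQ01 : HTTP Service - srv04.example.com:spam.eggs.com",
    "HQ01 : HTTP Service - srv04.example.com:spam.eggs.com - DISABLED",
]

# Only one group exists, so group(1) always holds the wanted host name.
results = [pattern.search(s).group(1) for s in samples]
print(results)
```

Because the prefix sits in a non-capturing group, it never shows up in groups(), whether or not it was present in the input.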

Upvotes: 0

livibetter

Reputation: 20450

I would rewrite the regular expression as

monitorName = re.compile(r"HQ01 : HTTP Service - (?:(?i:SRV)\d+\.\w+\.com:)?(\w+\.\w+\.(?:net|com|org))")

Produces

('spam.eggs.com',)
('spam.eggs.com',)
('spam.eggs.com',)
('spam.eggs.com',)

You can make a group optional by appending ? to it.

Upvotes: 0

Nicolás

Reputation: 7523

The | operator has the lowest precedence, so it applies to everything before it (from the beginning of your regex in this case) OR everything after it. In your regex, if there is no "srv04.example.com", it isn't checking whether the string contains "HTTP Service" at all!
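To illustrate the point about alternation, here is a small sketch that runs the question's original pattern against a string that lacks the "HQ01 : HTTP Service" prefix entirely. The names `original` and `unrelated` are made up for this example.

```python
import re

# The asker's original pattern: the top-level | splits it into two whole alternatives.
original = re.compile(
    r"HQ01 : HTTP Service - [Ss][Rr][Vv]\d+\.\w+\.com:(\w+\.\w+\.(?:net|com|org))"
    r"|(\w+\.\w+\.(?:net|com|org))"
)

# Hypothetical input with no "HQ01 : HTTP Service" prefix at all.
unrelated = "something else entirely - spam.eggs.com"

# The second alternative is just the bare host-name group, so it matches anyway.
m = original.search(unrelated)
print(m.groups())
```

The match succeeds through the second alternative alone, which shows that the literal prefix only constrains the first branch.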

Your two capturing groups are identical, so there's no point in having both. All you want is to have the srv*: part optional, right?

Try this one:

r"HQ01 : HTTP Service - (?:[Ss][Rr][Vv]\d+\.\w+\.com:)?(\w+\.\w+\.(?:net|com|org))"

Upvotes: 2

Ignacio Vazquez-Abrams

Reputation: 798804

Use m.group(1) or m.group(2), whichever is not None.
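One way this might look in practice, keeping the question's original pattern unchanged. This is a sketch under the assumption that exactly one of the two groups takes part in any match; the helper name `matched_name` is hypothetical.

```python
import re

# The question's original two-alternative pattern.
monitor_name = re.compile(
    r"HQ01 : HTTP Service - [Ss][Rr][Vv]\d+\.\w+\.com:(\w+\.\w+\.(?:net|com|org))"
    r"|(\w+\.\w+\.(?:net|com|org))"
)

def matched_name(text):
    """Return the host name from whichever group matched, or None if nothing matched."""
    m = monitor_name.search(text)
    if m is None:
        return None
    # The group that did not participate is None, so `or` picks the other one.
    return m.group(1) or m.group(2)

print(matched_name("HQ01 : HTTP Service - spam.eggs.com"))
print(matched_name("HQ01 : HTTP Service - srv04.example.com:spam.eggs.com - DISABLED"))
```

Since only one capturing group participates per match here, `m.group(m.lastindex)` should select the same value, because `lastindex` is the index of the last capturing group that matched.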

Upvotes: 1

kennytm

Reputation: 523374

m = monitorName.search(t)
g = m.groups()
print(g[0] or g[1])

Upvotes: 1
