koogee

Reputation: 963

Python regex matching specific word only, not a subset

I'm trying to search for specific whole words using a regex in Python.

import re

lst2 = ['Azmat', 'AZ', 'azim', 'Zard', 'Zardari']

pattern = re.compile(r"\bAZ|Zard\b", re.I)

for item in lst2:
    if pattern.search(item):
        print(item)

This code produces:

Azmat
AZ
azim
Zard

Why is it not matching "AZ" and "Zard" only?

Upvotes: 1

Views: 939

Answers (3)

Jerry

Reputation: 71598

It's because your regex is matching either:

\bAZ

OR

Zard\b

Use a non-capturing group to limit the scope of the | operator:

\b(?:AZ|Zard)\b

This way, it reads: \b, then either AZ or Zard, and finally \b.
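To see the difference, here is a small sketch that runs both patterns over the list from the question. The names loose and strict are just labels chosen for this illustration.

```python
import re

lst2 = ['Azmat', 'AZ', 'azim', 'Zard', 'Zardari']

# Original pattern: the alternation splits it into "\bAZ" OR "Zard\b".
loose = re.compile(r"\bAZ|Zard\b", re.I)
# Grouped pattern: both word boundaries apply to each alternative.
strict = re.compile(r"\b(?:AZ|Zard)\b", re.I)

loose_hits = [item for item in lst2 if loose.search(item)]
strict_hits = [item for item in lst2 if strict.search(item)]

print(loose_hits)   # ['Azmat', 'AZ', 'azim', 'Zard']
print(strict_hits)  # ['AZ', 'Zard']
```

"Zardari" is rejected by both patterns because "Zard" is followed by a letter there, so the trailing \b cannot match.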

Upvotes: 4

Victor Bocharsky

Reputation: 12306

What about:

pattern = re.compile(r"^(AZ|Zard)$", re.I)

It is better to mark the start and end of the string with ^ and $. This works here because each list item is a single word.
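One caveat worth checking: ^ and $ anchor the whole string, whereas \b only anchors a word. A rough sketch of the difference, using a few made-up sample strings (the samples list is not from the question):

```python
import re

anchored = re.compile(r"^(AZ|Zard)$", re.I)
bounded = re.compile(r"\b(?:AZ|Zard)\b", re.I)

# Hypothetical inputs, including one string with more than one word.
samples = ['AZ', 'Zard', 'Mr Zard', 'Zardari']

anchored_hits = [s for s in samples if anchored.search(s)]
bounded_hits = [s for s in samples if bounded.search(s)]

print(anchored_hits)  # ['AZ', 'Zard']
print(bounded_hits)   # ['AZ', 'Zard', 'Mr Zard']
```

So if the items can contain several words and you want to find the word anywhere inside them, the \b version is probably the one you want. If every item must be exactly the word, the anchored version is stricter.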

Upvotes: 2

Ricardo Cárdenes

Reputation: 9172

Your current code is looking for a word starting with "az" or ending with "zard". Fix it like this:

pattern = re.compile(r"\b(AZ|Zard)\b", re.I)
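As a side note, the parentheses here form a capturing group, so the matched word is also available as group 1. A minimal sketch, assuming a made-up sentence as input (text is hypothetical, not from the question):

```python
import re

capturing = re.compile(r"\b(AZ|Zard)\b", re.I)
non_capturing = re.compile(r"\b(?:AZ|Zard)\b", re.I)

text = "AZ met Zardari and Zard"  # hypothetical input

first = capturing.search(text)
print(first.group(1))          # AZ
print(capturing.groups)        # 1
print(non_capturing.groups)    # 0

# Both forms find the same whole words; "Zardari" is skipped.
found_capturing = capturing.findall(text)
found_non_capturing = non_capturing.findall(text)
print(found_capturing)         # ['AZ', 'Zard']
print(found_non_capturing)     # ['AZ', 'Zard']
```

Either form fixes the original problem. Use (?:...) if you do not need the captured text.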

Upvotes: 3
