Flying falcon

Reputation: 133

How to match an exact word inside a string?

import re

op = ['TRAIL_RATE_ID 8 TRAIL_RATE_NAME VC-4 TRAIL_ORDER High Order ',
      'TRAIL_RATE_ID 9 TRAIL_RATE_NAME VC4-4 TRAIL_ORDER High Order ',
      'TRAIL_RATE_ID 10 TRAIL_RATE_NAME VC-8 TRAIL_ORDER High Order ']
word = "8"
for op1 in op:
    pp = re.search('(\\b' + word + '\\b)', op1, flags=re.IGNORECASE|re.DOTALL)
    print(bool(pp))

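This reports a match for two of the strings: the first (TRAIL_RATE_ID 8) and the third (VC-8).

I want it to match only the first string, where 8 stands alone as a whole token. The word can be anything, for example word = "8", word = "$#hhd", or word = "hi hello".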

How do I match this using regex?

Upvotes: 2

Views: 1203

Answers (2)

anubhava

Reputation: 785128

Word boundaries won't help here because - is not a word character, so \b still matches between the - and the 8 in VC-8.
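A quick check of that behaviour, as a minimal sketch (the test string 'VC48' is made up for contrast and is not from the question):

```python
import re

# \b sits between a word character (letter, digit, underscore) and a non-word one.
# '-' is a non-word character, so there is a boundary right before the 8 in 'VC-8'.
hyphen_case = bool(re.search(r'\b8\b', 'VC-8'))

# '4' is a word character, so there is no boundary before the 8 in 'VC48'.
digit_case = bool(re.search(r'\b8\b', 'VC48'))

print(hyphen_case, digit_case)
```

This is why \b8\b also fires on the third string in the question.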

You can use lookarounds:

# re.escape makes any regex metacharacters in word (e.g. "$#hhd") match literally
p = re.compile(r'(?:(?<=^)|(?<=\s))' + re.escape(word) + r'(?=\s|$)', flags=re.IGNORECASE|re.M)
re.search(p, op1)


  • (?<=^)|(?<=\s) is a lookbehind to ensure we have line start or whitespace before our word
  • (?=\s|$) is a lookahead to ensure we have line end or whitespace next to our word
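As a rough sketch, the lookaround pattern can be run against the three strings from the question. The helper name has_token is hypothetical, and the re.escape call is an assumption added so that words such as "$#hhd" are treated literally:

```python
import re

op = [
    'TRAIL_RATE_ID 8 TRAIL_RATE_NAME VC-4 TRAIL_ORDER High Order ',
    'TRAIL_RATE_ID 9 TRAIL_RATE_NAME VC4-4 TRAIL_ORDER High Order ',
    'TRAIL_RATE_ID 10 TRAIL_RATE_NAME VC-8 TRAIL_ORDER High Order ',
]

def has_token(word, text):
    # Hypothetical helper: True if word appears delimited by whitespace
    # or by a line start/end on both sides.
    p = re.compile(r'(?:(?<=^)|(?<=\s))' + re.escape(word) + r'(?=\s|$)',
                   flags=re.IGNORECASE | re.M)
    return p.search(text) is not None

results = [has_token("8", s) for s in op]
print(results)  # [True, False, False]
```

Only the first string matches, because the 8 in VC-8 is preceded by - rather than whitespace.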

Upvotes: 4

Wiktor Stribiżew

Reputation: 626802

You can require that there is no non-whitespace character immediately before or after the word:

r'(?<!\S){0}(?!\S)'.format(re.escape(word))

See the regex demo

I added re.escape(word) in case your keywords contain special regex metacharacters that should be treated literally.

See Python demo:

import re
word = "8"
# (?<!\S) and (?!\S): no non-whitespace character directly before or after the word
pat = r'(?<!\S){0}(?!\S)'.format(re.escape(word))
print(re.search(pat, "nnn 8", flags=re.IGNORECASE))
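To see how this pattern behaves for the three kinds of word mentioned in the question, here is a small sketch. The helper name find_exact and the shortened test strings are made up for illustration:

```python
import re

def find_exact(word, text):
    # Hypothetical helper wrapping the pattern above.
    pat = r'(?<!\S){0}(?!\S)'.format(re.escape(word))
    return re.search(pat, text, flags=re.IGNORECASE) is not None

checks = [
    find_exact("8", "TRAIL_RATE_ID 8 TRAIL_RATE_NAME VC-4"),   # standalone 8
    find_exact("8", "TRAIL_RATE_ID 10 TRAIL_RATE_NAME VC-8"),  # 8 glued to '-'
    find_exact("$#hhd", "code $#hhd end"),                     # metacharacters, standalone
    find_exact("$#hhd", "code x$#hhd end"),                    # preceded by 'x'
    find_exact("hi hello", "say Hi Hello there"),              # phrase with a space
]
print(checks)  # [True, False, True, False, True]
```

Because the lookarounds test for whitespace rather than word characters, the same pattern works for digits, punctuation-heavy words, and multi-word phrases.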

Upvotes: 6
