Regex to match word but only if it doesn't start with a non-alphanumerical character

Question

I have sentences in which I want to identify certain words, but not when the word is immediately preceded by a non-alphanumerical character. It's fine if the word is followed by one, though.

An example of what I've done:

import re

words = ["THIS", "THAT"]
sentences = ["I want to identify THIS word.", "And THAT!", "But I do not want to identify !THIS word", "Or [THIS] word"]

for sentence in sentences:
    for word in words:
        # \b on both sides only requires a word/non-word transition
        word_re = re.search(r"\b(%s)\b" % word, sentence)
        if word_re:
            print("It's a match!")

The code above reports a match in every one of the sentences. I would like a pattern that matches only in the first two sentences. Is it possible to do what I want with regex?

Thanks!

Wiktor Stribiżew · Accepted Answer

You can use a regex like

(?<!\S)(?:THIS|THAT)\b

See the regex demo. Details:

(?<!\S) - a left-hand whitespace boundary: the match must begin at the start of the string or right after a whitespace character
(?:THIS|THAT) - a non-capturing group matching either THIS or THAT
\b - a word boundary.
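To make the difference between the two left-hand boundaries concrete, here is a minimal sketch that runs the question's sentences through both variants. It assumes the left-hand whitespace boundary is written as `(?<!\S)`; the names `word_boundary`, `whitespace_boundary`, and `matches` are only illustrative and do not come from the answer.

```python
import re

sentences = [
    "I want to identify THIS word.",
    "And THAT!",
    "But I do not want to identify !THIS word",
    "Or [THIS] word",
]

# Original approach: \b on both sides. "!" and "[" are non-word characters,
# so a word boundary still exists between them and the following "T".
word_boundary = re.compile(r"\b(?:THIS|THAT)\b")

# Assumed reading of the answer: (?<!\S) rejects any match that is
# immediately preceded by a non-whitespace character.
whitespace_boundary = re.compile(r"(?<!\S)(?:THIS|THAT)\b")


def matches(pattern, texts):
    """Return one boolean per text: True if the pattern is found in it."""
    return [bool(pattern.search(text)) for text in texts]


word_boundary_hits = matches(word_boundary, sentences)
whitespace_boundary_hits = matches(whitespace_boundary, sentences)

for sentence, wb, ws in zip(sentences, word_boundary_hits, whitespace_boundary_hits):
    print(f"{sentence!r}: \\b -> {wb}, (?<!\\S) -> {ws}")
```

With `\b` on the left, all four sentences match; with `(?<!\S)`, only the first two do, because `!THIS` and `[THIS]` are preceded by a non-whitespace character.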

See the Python demo:

import re

words = ["THIS", "THAT"]
sentences = ["I want to identify THIS word.", "And THAT!", "But I do not want to identify !THIS word", "Or [THIS] word"]

pattern = fr"(?<!\S)(?:{'|'.join(words)})\b"
for sentence in sentences:
    if re.search(pattern, sentence):
        print(f"'{sentence}' is a match!")

# => 'I want to identify THIS word.' is a match!
#    'And THAT!' is a match!

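If THIS or THAT can contain special chars, replace pattern = fr"(?<!\S)(?:{'|'.join(words)})\b" with a version that escapes each word before joining, e.g. '|'.join(map(re.escape, words)) in place of '|'.join(words).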
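As a rough sketch of the special-character case, the snippet below escapes each word with `re.escape` before building the alternation. The word list (`"C++"`), the sample sentences, and the trailing `(?!\w)` are assumptions made for illustration, not part of the answer: `(?!\w)` is swapped in for `\b` here because `\b` after a non-word character such as `+` would demand a following word character.

```python
import re

# Hypothetical word list: "C++" contains regex metacharacters.
words = ["C++", "THIS"]

# re.escape turns "C++" into a pattern that matches the literal text,
# instead of "+" being read as a quantifier.
alternation = "|".join(map(re.escape, words))

# Assumption: (?!\w) replaces the trailing \b so that words ending in a
# non-word character can still be followed by whitespace or punctuation.
pattern = re.compile(fr"(?<!\S)(?:{alternation})(?!\w)")

samples = [
    "I like C++ a lot",   # preceded by a space
    "Not !C++ though",    # preceded by "!"
    "THIS!",              # at the start of the string, followed by "!"
    "THISTLE",            # THIS is followed by another word character
]

results = [bool(pattern.search(sample)) for sample in samples]

for sample, found in zip(samples, results):
    print(f"{sample!r}: {found}")
```

If every word is known to end in a word character, keeping the trailing `\b` from the original pattern should behave the same way as `(?!\w)`.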
