python3 regex match pattern but only if it does not end in certain character

Question

I have a question about how to replace a string pattern, but only if it does not end in an exclamation mark.

For example, "Thanks, Bob" or "Thanks, Bob." should be replaced with "Thanks, [NAME]" but "Thanks, Bob!" should NOT be replaced.

So far I have this:

regex = r"Thanks\,(\s)?(\n+)?[A-Z]?[a-z]+[^!]"
re.sub(regex, "Thanks, [NAME]", text)

This works when there is punctuation after "Bob", but it won't work for the plain case "Thanks, Bob".

Any ideas?

Wiktor Stribiżew · Accepted Answer

You may use

(Thanks,\s*)[A-Z][a-z]+\b(?!!)

and replace with \1[NAME].

The point is that you need to use a word boundary \b after [a-z]+ and add a negative lookahead (?!!) right after.

Details

(Thanks,\s*) - Group 1 (\1 in the replacement pattern): Thanks, and 0+ whitespaces (\s*)
[A-Z][a-z]+ - an uppercase letter and then 1+ lowercase ones
\b - a word boundary, the next char cannot be letter/digit/_
(?!!) - no ! immediately to the right of the current location is allowed.
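
To see why the \b matters, here is a small sketch that compares the pattern with and without it. The variable names (with_boundary, without_boundary, kept, broken) are made up for this illustration. Without \b, the engine is free to backtrack [a-z]+ from "ob" to "o"; the lookahead then sees "b" rather than "!", so a partial match goes through.

```python
import re

# Hypothetical names, used only for this comparison.
with_boundary = re.compile(r"(Thanks,\s*)[A-Z][a-z]+\b(?!!)")
without_boundary = re.compile(r"(Thanks,\s*)[A-Z][a-z]+(?!!)")

s = "Thanks, Bob!"

# With \b the whole name has to be consumed, so the lookahead
# is checked right before "!" and the match is rejected.
kept = with_boundary.sub(r"\1[NAME]", s)

# Without \b, [a-z]+ can give back the final "b"; the lookahead
# then sees "b" instead of "!", and "Thanks, Bo" is replaced.
broken = without_boundary.sub(r"\1[NAME]", s)

print(kept)    # Thanks, Bob!
print(broken)  # Thanks, [NAME]b!
```

So the lookahead alone is not enough: the word boundary is what pins the check to the end of the name.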

Python demo:

import re
rx = r"(Thanks,\s*)[A-Z][a-z]+\b(?!!)"
strs = ["Thanks, Bob", "Thanks, Bob.", "Thanks, Bob!"]
for s in strs:
    print(re.sub(rx, r"\1[NAME]", s))

Output:

Thanks, [NAME]
Thanks, [NAME].
Thanks, Bob!
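
If the replacement is needed in more than one place, the same pattern could be wrapped in a small helper. This is only a sketch: the names NAME_RX and mask_names are invented here, and it assumes names are a single capitalized ASCII word, as in the question.

```python
import re

# Hypothetical helper; the pattern is the one from the answer, precompiled.
NAME_RX = re.compile(r"(Thanks,\s*)[A-Z][a-z]+\b(?!!)")

def mask_names(text):
    """Replace the name after "Thanks," unless it is followed by "!"."""
    # \g<1> is the same backreference as \1, written unambiguously.
    return NAME_RX.sub(r"\g<1>[NAME]", text)

# Several occurrences in one string are handled independently.
sample = "Thanks, Bob. Thanks, Alice! Thanks, Carol"
print(mask_names(sample))  # Thanks, [NAME]. Thanks, Alice! Thanks, [NAME]
```

Because Group 1 captures the whitespace along with "Thanks,", whatever spacing was in the input is kept in the output.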
