Imu

Reputation: 555

Python 3 regex: match a pattern, but only if it does not end in a certain character

I have a question about how to replace a string pattern, but only if it does not end in an exclamation mark.

For example, "Thanks, Bob" or "Thanks, Bob." should be replaced with "Thanks, [NAME]" but "Thanks, Bob!" should NOT be replaced.

So far I have this:

regex = r"Thanks\,(\s)?(\n+)?[A-Z]?[a-z]+[^!]"
re.sub(regex, "Thanks, [NAME]", text)

This works when there is punctuation after "Bob", but it won't work for the case "Thanks, Bob".

Any ideas?

Upvotes: 1

Views: 52

Answers (2)

Wiktor Stribiżew

Reputation: 626738

You may use

(Thanks,\s*)[A-Z][a-z]+\b(?!!)

and replace with \1[NAME].

The point is that you need to use a word boundary \b after [a-z]+ and add a negative lookahead (?!!) right after.

Details

  • (Thanks,\s*) - Group 1 (\1 in the replacement pattern): Thanks, and 0+ whitespaces (\s*)
  • [A-Z][a-z]+ - an uppercase letter and then 1+ lowercase ones
  • \b - a word boundary, the next char cannot be letter/digit/_
  • (?!!) - no ! immediately to the right of the current location is allowed.
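
To see why the word boundary matters, here is a minimal sketch comparing the pattern above with a hypothetical variant that drops \b (the variant and the variable names are only for illustration). Without \b, [a-z]+ can backtrack by one letter so that the lookahead sees "b" instead of "!", and the match succeeds on a truncated name:

```python
import re

# Pattern from this answer, plus a hypothetical variant with \b removed.
with_boundary = re.compile(r"(Thanks,\s*)[A-Z][a-z]+\b(?!!)")
without_boundary = re.compile(r"(Thanks,\s*)[A-Z][a-z]+(?!!)")

s = "Thanks, Bob!"

# Without \b, [a-z]+ backs off from "ob" to "o"; the lookahead then sees "b", not "!".
no_boundary_result = without_boundary.sub(r"\1[NAME]", s)

# With \b, the match cannot end between "o" and "b", so nothing is replaced.
boundary_result = with_boundary.sub(r"\1[NAME]", s)

print(no_boundary_result)  # Thanks, [NAME]b!
print(boundary_result)     # Thanks, Bob!
```

The word boundary pins the end of the match to the end of the name, so the lookahead is always checked against the character that actually follows it.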

Python demo:

import re
rx = r"(Thanks,\s*)[A-Z][a-z]+\b(?!!)"
strs = ["Thanks, Bob", "Thanks, Bob.", "Thanks, Bob!"]
for s in strs:
    print(re.sub(rx, r"\1[NAME]", s))

Output:

Thanks, [NAME]
Thanks, [NAME].
Thanks, Bob!
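
The original pattern also allowed a line break between "Thanks," and the name. Since \s* matches newlines too, the same regex should cover that case. A small sketch, assuming a made-up multi-line text (the sample names are hypothetical):

```python
import re

rx = re.compile(r"(Thanks,\s*)[A-Z][a-z]+\b(?!!)")

# Hypothetical sample text: the second sign-off has a line break after the comma.
text = "Thanks, Bob\nThanks,\nAlice.\nThanks, Carol!"

# Group 1 keeps whatever whitespace followed the comma, including the newline.
masked = rx.sub(r"\1[NAME]", text)
print(masked)
```

Because the whitespace is captured in Group 1 and written back via \1, the line break is preserved in the result, and the trailing period after "Alice" stays in place.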

Upvotes: 1

Emma

Reputation: 27723

My guess is that your expression is mostly fine; it could be modified slightly to:

^Thanks\s*,\s*([A-Z]?[a-z]*)\s*[^!]?$

Test

import re

regex = r"^Thanks\s*,\s*([A-Z]?[a-z]*)\s*[^!]?$"

test_str = ("Thanks, Bob\n"
    "Thanks, Bob.\n"
    "Thanks, Bob!")

subst = "Thanks, [NAME]"

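# Pass count and flags by keyword; positional use is deprecated in recent Python versions
result = re.sub(regex, subst, test_str, count=0, flags=re.MULTILINE)

if result:
    print(result)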
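
Two properties of this anchored expression are worth checking before using it. A rough sketch with a few hypothetical inputs (not from the question): the optional [^!]? consumes a trailing punctuation mark, so it is dropped from the result, and the ^ anchor means the phrase is only replaced when it starts a line:

```python
import re

regex = r"^Thanks\s*,\s*([A-Z]?[a-z]*)\s*[^!]?$"

# Hypothetical inputs chosen to show the effect of the anchors.
samples = ["Thanks, Bob.", "Thanks, Bob!", "Well, Thanks, Bob"]

results = [re.sub(regex, "Thanks, [NAME]", s, flags=re.MULTILINE) for s in samples]

for before, after in zip(samples, results):
    print(repr(before), "->", repr(after))
```

The first sample loses its period, the second is left alone, and the third is left alone because "Thanks" does not start the line. If the sign-off can appear mid-line, or the punctuation should be kept, an unanchored pattern with a lookahead may be a better fit.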

Upvotes: 0
