Taldakus

Reputation: 715

Replace a string using dictionary - regex

I have a dictionary of slang terms with their meanings, and I want to replace every slang term in my text with its meaning.

I have found a partially working solution: https://stackoverflow.com/a/2400577

For now my code looks like this:

import re

myText = 'brb some sample text I lov u. I need some $$ for 2mw.'

dictionary = {
  'brb': 'be right back',
  'lov u': 'love you',
  '$$': 'money',
  '2mw': 'tomorrow'
}

pattern = re.compile(r'\b(' + '|'.join(re.escape(key) for key in dictionary.keys()) + r')\b')
result = pattern.sub(lambda x: dictionary[x.group()], myText)

print(result)

Output:

be right back some sample text I love you. I need some $$ for tomorrow.

As you can see, the $$ signs haven't been replaced, and I know this is due to the \b word boundaries. How can I change my regex to achieve my goal?

Upvotes: 4

Views: 4840

Answers (1)

Wiktor Stribiżew

Reputation: 627607

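A \b word boundary only matches between a word char and a non-word char, so it can never match next to $$ when that key is surrounded by spaces or punctuation. Replace the word boundaries with negative lookarounds that fail the match when a word char is adjacent to the search phrase: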

pattern = re.compile(r'(?<!\w)(' + '|'.join(re.escape(key) for key in dictionary.keys()) + r')(?!\w)')


The (?<!\w) negative lookbehind fails the match if there is a word char immediately to the left of the current location, and the (?!\w) negative lookahead fails the match if there is a word char immediately to the right of the current location.
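For reference, here is a minimal sketch of the question's code with the lookaround pattern swapped in. The sample text and dictionary come from the question; sorting the keys longest-first is an extra precaution assumed here (not part of the original answer), so that a shorter key cannot win over a longer key that starts with it.

```python
import re

my_text = 'brb some sample text I lov u. I need some $$ for 2mw.'

dictionary = {
    'brb': 'be right back',
    'lov u': 'love you',
    '$$': 'money',
    '2mw': 'tomorrow',
}

# Assumption: try longer keys first so a short key cannot shadow a longer one.
keys = sorted(dictionary, key=len, reverse=True)
alternation = '|'.join(re.escape(key) for key in keys)

# (?<!\w) and (?!\w) replace the \b anchors from the question.
pattern = re.compile(r'(?<!\w)(?:' + alternation + r')(?!\w)')

result = pattern.sub(lambda m: dictionary[m.group()], my_text)
print(result)
# be right back some sample text I love you. I need some money for tomorrow.
```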

If you only need to match search phrases that are enclosed with whitespace chars or the start/end of the string, replace (?<!\w) with (?<!\S) and (?!\w) with (?!\S).
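The difference between the two variants shows up when punctuation touches a key. The sketch below compares them; `build_pattern` is a hypothetical helper written for this illustration, and the sample sentence is made up.

```python
import re

dictionary = {'brb': 'be right back', '$$': 'money', '2mw': 'tomorrow'}

def build_pattern(keys, char_class):
    # Hypothetical helper: char_class is r'\w' or r'\S'. The match fails
    # when a char of that class directly touches the search phrase.
    alternation = '|'.join(re.escape(key) for key in keys)
    return re.compile(
        r'(?<!' + char_class + r')(?:' + alternation + r')(?!' + char_class + r')'
    )

def expand(text, pattern):
    # Replace each matched key with its meaning from the dictionary.
    return pattern.sub(lambda m: dictionary[m.group()], text)

text = 'brb, I need $$ for 2mw.'

word_variant = expand(text, build_pattern(dictionary, r'\w'))
space_variant = expand(text, build_pattern(dictionary, r'\S'))

print(word_variant)   # be right back, I need money for tomorrow.
print(space_variant)  # brb, I need money for 2mw.
```

With the \S variant, "brb," and "2mw." are left alone because a comma and a period are non-whitespace chars adjacent to the keys.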

Upvotes: 2
