Replace words in string by exact match in dictionaries

Question

text = "One sentence with one (two) three, but mostly one. And twos."

Desired result: A sentence with A (B) C, but mostly A. And twos.

Words should be replaced only on an exact match in lookup_dict. The two in twos should therefore not be replaced, since the word contains an additional letter. Words next to spaces, commas, parentheses and periods should still be replaced.

import re

lookup_dict = {'var': ["one", "two", "three"]}
match_dict = {'var': ["A", "B", "C"]}

var_dict = {}

for i, v in enumerate(lookup_dict['var']):
    var_dict[v] = match_dict['var'][i]

# Compile once, after the mapping is complete
xpattern = re.compile('|'.join(var_dict.keys()))
result = xpattern.sub(lambda x: var_dict[x.group()], text.lower())

result: A sentence with A (B) C, but mostly A. and Bs.

Can I achieve the desired output without adding every possible combination of words + adjacent characters to the dictionaries? This seems unnecessarily complicated:

lookup_dict = {'var': ['one ', 'one,', '(one)', 'one.', 'two ', 'two,', '(two)', 'two.', 'three ', 'three,', '(three)', 'three.']}
...
result = xpattern.sub(lambda x: var_dict[x.group()] if x.group() in lookup_dict['var'] else x.group(), text.lower())

engineer14 · Accepted Answer

import re

w = "Where are we one today two twos them"
lookup_dict = {"one": "1", "two": "2", "three": "3"}
# \b on both sides restricts each alternative to whole-word matches
pattern = re.compile(r'\b(' + '|'.join(lookup_dict.keys()) + r')\b')
output = pattern.sub(lambda x: lookup_dict[x.group()], w)
print(output)

This would print out 'Where are we 1 today 2 twos them'

In short:

I changed your dictionary so that each word to replace is a key and its replacement is the value.

I built a regex of the form \b(every|key|in|your|dictionary)\b, which matches any one of the dictionary keys. The word boundaries (\b) on either side ensure that a key only matches as a whole word, i.e. when it is surrounded by non-word characters such as spaces, commas, parentheses or periods, or sits at the start or end of the string.

Then pattern.sub replaces every match with the value looked up in the dictionary.
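
Applied to the text and the two dictionaries from the question, the same idea could look roughly like the sketch below. The zip-based pairing, the re.escape call and the re.IGNORECASE flag are additions of mine and not part of the code above. They assume the replacement lists line up by index and that a capitalised "One" should also be replaced without lowercasing the rest of the sentence.

```python
import re

text = "One sentence with one (two) three, but mostly one. And twos."
lookup_dict = {'var': ["one", "two", "three"]}
match_dict = {'var': ["A", "B", "C"]}

# Pair each lookup word with the replacement at the same index.
var_dict = dict(zip(lookup_dict['var'], match_dict['var']))

# re.escape guards against keys that contain regex metacharacters.
# re.IGNORECASE lets "One" match while leaving the rest of the text as is.
pattern = re.compile(
    r'\b(' + '|'.join(map(re.escape, var_dict)) + r')\b',
    re.IGNORECASE,
)

# Lowercase the matched word before the lookup, since the keys are lowercase.
result = pattern.sub(lambda m: var_dict[m.group().lower()], text)
print(result)  # A sentence with A (B) C, but mostly A. And twos.
```

Note that \b only behaves this way when the keys start and end with word characters (letters, digits or underscore). A key such as "(one)" would need a different kind of boundary check.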
