Reputation: 62774
I have many string pairs and one large string (which is the content of a file). I need to replace each and every occurrence of the first member in each pair with the respective second one.
For instance, having pairs ("AA", "BB") and ("qq", "rt") I need to replace every occurrence of AA with BB and of qq with rt.
The strings in the pairs are all unique, so the order of replacements does not matter on the final result.
My python code is the most naive - I apply the string.replace method in succession until all the pairs are exhausted:
>>> s="frsfsdAAsdfvsdfvqqdsff"
>>> pairs=[('AA', 'BB'), ('qq', 'rt')]
>>> for p in pairs:
... s=s.replace(p[0], p[1])
...
>>> s
'frsfsdBBsdfvsdfvrtdsff'
>>>
I believe this is a bad solution for large strings. Can anyone suggest a more efficient one?
The question is about how to do it in Python.
Thanks.
Upvotes: 4
Views: 300
Reputation: 838376
There's something else wrong with your proposed solution: after the first replacement is made the resulting string could match and the same characters could be replaced again. For example your solution would not give the desired result if you tried to swap 'qq'
and 'ff'
by setting pairs = [('qq','ff'), ('ff','qq')]
.
You could try this instead:
>>> d = dict(pairs)
>>> import re
>>> pattern = re.compile('|'.join(re.escape(k) for k in d))
>>> pattern.sub(lambda k:d[k.group()], s))
frsfsdBBsdfvsdfvrtdsff
Upvotes: 3