mark

Reputation: 62774

What is the recommended way to replace multiple strings in one large string in Python?

I have many string pairs and one large string (which is the content of a file). I need to replace each and every occurrence of the first member in each pair with the respective second one.

For instance, having pairs ("AA", "BB") and ("qq", "rt") I need to replace every occurrence of AA with BB and of qq with rt.

The strings in the pairs are all unique, so the order of the replacements does not affect the final result.

My Python code is as naive as it gets. I call str.replace for each pair in succession until all the pairs are exhausted:

>>> s="frsfsdAAsdfvsdfvqqdsff"
>>> pairs=[('AA', 'BB'), ('qq', 'rt')]
>>> for p in pairs:
...   s=s.replace(p[0], p[1])
...
>>> s
'frsfsdBBsdfvsdfvrtdsff'
>>>

I believe this is a bad solution for large strings. Can anyone suggest a more efficient one?

The question is about how to do it in Python.

Thanks.

Upvotes: 4

Views: 300

Answers (1)

Mark Byers

Reputation: 838376

There is another problem with your proposed solution: after one replacement is made, the resulting text can match a later pair, so the same characters may be replaced again. For example, your solution would not give the desired result if you tried to swap 'qq' and 'ff' by setting pairs = [('qq','ff'), ('ff','qq')].
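To make the swap problem concrete, here is a small sketch that runs the question's loop on the question's string with the swap pairs. The helper name sequential_replace is made up for this illustration.

```python
def sequential_replace(text, pairs):
    # Apply each replacement in turn, exactly like the loop in the question.
    for old, new in pairs:
        text = text.replace(old, new)
    return text

s = "frsfsdAAsdfvsdfvqqdsff"
swapped = sequential_replace(s, [("qq", "ff"), ("ff", "qq")])

# The first pass turns 'qq' into 'ff'; the second pass then turns every 'ff'
# (including the one just created) into 'qq', so nothing is actually swapped.
print(swapped)  # frsfsdAAsdfvsdfvqqdsqq
```

A true swap would have produced 'frsfsdAAsdfvsdfvffdsqq'.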

You could try this instead:

>>> import re
>>> d = dict(pairs)
>>> pattern = re.compile('|'.join(re.escape(k) for k in d))
>>> pattern.sub(lambda m: d[m.group()], s)
'frsfsdBBsdfvsdfvrtdsff'
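The single-pass idea above could be wrapped in a reusable function, roughly as sketched below. The name replace_all is hypothetical. Sorting the keys longest-first and guarding against an empty pair list are additions of mine, not part of the original snippet: regex alternation picks the first alternative that matches, so without the sort a key that is a prefix of another key could win.

```python
import re

def replace_all(text, pairs):
    """Replace every first member of each pair with its second member in one pass."""
    mapping = dict(pairs)
    if not mapping:
        # An empty alternation would match the empty string everywhere.
        return text
    # Assumption: longer keys should take priority over their own prefixes.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    # Each match is looked up once; replaced text is never rescanned.
    return pattern.sub(lambda m: mapping[m.group()], text)

s = "frsfsdAAsdfvsdfvqqdsff"
result = replace_all(s, [("AA", "BB"), ("qq", "rt")])
swapped_once = replace_all(s, [("qq", "ff"), ("ff", "qq")])
print(result)        # frsfsdBBsdfvsdfvrtdsff
print(swapped_once)  # frsfsdAAsdfvsdfvffdsqq
```

Because the string is scanned once and replacement text is not matched again, the swap case now behaves as intended.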

Upvotes: 3
