razz

Reputation: 10120

How to replace repeated pattern of characters

I have a string in which pairs of random characters are repeated three times in a row, for example "abababwhatevercdcdcd". I want to remove these repeated pairs and keep the rest of the string ("whatever" in this example). How do I do that?

I tried the following:

import re
re.sub(r'([a-z0-9]{2}){3}', r'', string)

but it does not work

Upvotes: 1

Views: 994

Answers (2)

jscs

Reputation: 64002

You need backreferences here in order to repeat the match that was actually made, as opposed to trying to make a new match with the same pattern:

([a-z0-9]{2})\1\1

>>> import re
>>> re.sub(r'([a-z0-9]{2})\1\1', r'', "abababwhatevercdcdcd")
'whatever'
>>> re.sub(r'([a-z0-9]{2})\1\1', r'', "wabababhatevercdcdcd")
'whatever'
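
To see the difference concretely, here is a small sketch comparing the pattern from the question with the backreference version. The variable names (`text`, `without_backref`, `with_backref`) are only for illustration.

```python
import re

text = "abababwhatevercdcdcd"

# ([a-z0-9]{2}){3} re-applies the pattern three times, so it matches
# ANY six characters from the class, not the same pair three times.
without_backref = re.sub(r'([a-z0-9]{2}){3}', '', text)

# \1 must match the exact text that group 1 captured.
with_backref = re.sub(r'([a-z0-9]{2})\1\1', '', text)

print(without_backref)  # cd
print(with_backref)     # whatever
```

The first call consumes "ababab", "whatev" and "ercdcd" in turn, which is why only "cd" is left over.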

Upvotes: 4

Casimir et Hippolyte

Reputation: 89574

For a repeated group of two or more characters (of any length, repeated at least twice in a row), you can use:

(.{2,})\1+
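
A quick sketch of how this more general pattern behaves, assuming Python's `re` module as in the question. Note that it is looser than "a pair repeated exactly 3 times", so it can also strip repeats that occur naturally in ordinary words; the "banana" input and the stricter `(.{2})\1{2}` variant are illustrative additions, not part of the original answer.

```python
import re

general = r'(.{2,})\1+'     # any group of 2+ chars, repeated at least twice
strict = r'(.{2})\1{2}'     # exactly one pair, appearing three times in a row

# Works on the example from the question.
cleaned = re.sub(general, '', "abababwhatevercdcdcd")
print(cleaned)  # whatever

# But it also removes "anan" from an ordinary word.
over_matched = re.sub(general, '', "banana")
print(over_matched)  # ba

# The stricter variant leaves "banana" alone and still handles the example.
strict_word = re.sub(strict, '', "banana")
strict_cleaned = re.sub(strict, '', "abababwhatevercdcdcd")
print(strict_word, strict_cleaned)  # banana whatever
```

Also keep in mind that `.` does not match a newline unless `re.DOTALL` is passed.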

Upvotes: 1
