I want to replace every run of a repeated character in my text. I used Python's re.sub with this regex: data=re.sub('(.)\1+','##',data), but nothing happened.
Here is my text:
※※※※※※※※※※※※※※※※※Chapter One※※※※※※※※※※※※※※※※※※
This is the begining...
Upvotes: 1
Views: 90
Reputation: 250891
You need to use a raw string here. In an ordinary string literal, \1 is interpreted as an octal escape, so the string ends up containing the character whose code point is that octal value rather than a backslash followed by 1.
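>>> '\1'
'\x01'
>>> chr(0o1)  # 0o1 is the octal literal for 1 (works in Python 2.6+ and 3)
'\x01'
>>> '\101'
'A'
>>> chr(0o101)  # octal 101 is decimal 65, the code point of 'A'
'A'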
Use a raw string to fix this:
>>> '(.)\1+'
'(.)\x01+'
>>> r'(.)\1+' #Note the `r`
'(.)\\1+'
Upvotes: 3
Reputation: 280291
Use a raw string, so the regex engine interprets backslashes instead of the Python parser. Just put an r in front of the string:
data=re.sub(r'(.)\1+', '##', data)
            ^ this r is the important bit
Otherwise, \1 is interpreted as character value 1 instead of a backreference.
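To see the difference side by side, here is a minimal sketch. The helper name collapse_runs and the shortened sample string are made up for illustration and are not from the question; only re.sub and the two patterns come from the thread.

```python
import re

# Non-raw literal: Python turns \1 into the control character U+0001
# before the regex engine ever sees the pattern.
broken_pattern = '(.)\1+'

# Raw literal: the backslash survives, so the regex engine sees a backreference.
working_pattern = r'(.)\1+'


def collapse_runs(text, marker='##'):
    """Replace each run of two or more identical characters with marker.

    Hypothetical helper, named here only for illustration.
    """
    return re.sub(working_pattern, marker, text)


sample = '※※※※Chapter One※※※※'  # shortened stand-in for the question's text

print(len(broken_pattern), len(working_pattern))        # 5 6
print(re.sub(broken_pattern, '##', sample) == sample)   # True: nothing matched
print(collapse_runs(sample))                            # ##Chapter One##
```

The broken pattern only matches a character followed by one or more literal U+0001 characters, which is why the original call returned the text unchanged.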
Upvotes: 1