Reputation: 17708
I need to remove repeated words in a string, so that 'the (the)' becomes 'the'. Why can't I do it as follows?
re.sub('(.+) \(\1\)', '\1', 'the (the)')
Thanks.
Upvotes: 4
Views: 2048
Reputation: 31518
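In a normal string literal, Python turns '\1' into the single character with octal code 1 before the regex engine ever sees it, so you need to doubly escape the back-reference: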
re.sub('(.+) \(\\1\)', '\\1', 'the (the)')
--> the
Or use the r prefix:
When an "r" or "R" prefix is present, a character following a backslash is included in the string without change, and all backslashes are left in the string.
re.sub(r'(.+) \(\1\)', r'\1', 'the (the)')
--> the
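To see why the original attempt fails, here is a small sketch comparing the two literals. The variable names (text, broken, fixed) are just for illustration, and the parentheses in the non-raw pattern are written as '\\(' so the snippet runs without escape-sequence warnings on newer Python versions; the resulting pattern is the same one as in the question.

```python
import re

text = 'the (the)'

# In a normal string literal, '\1' is an octal escape: one character, chr(1).
print(repr('\1'))    # '\x01'
print(len('\1'))     # 1

# In a raw string literal the backslash survives, so the regex engine
# receives a backslash followed by '1', i.e. a back-reference to group 1.
print(repr(r'\1'))   # '\\1'
print(len(r'\1'))    # 2

# Non-raw version: the pattern looks for chr(1) between the parentheses,
# which never occurs in the text, so nothing is replaced.
broken = re.sub('(.+) \\(\1\\)', '\1', text)
print(repr(broken))  # 'the (the)'

# Raw version: \1 is a real back-reference in both pattern and replacement.
fixed = re.sub(r'(.+) \(\1\)', r'\1', text)
print(repr(fixed))   # 'the'
```

So the call in the question does not raise an error; it just silently matches nothing and returns the input unchanged.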
Upvotes: 6
Reputation: 7530
According to the documentation: 'Raw string notation (r"text") keeps regular expressions sane.'
Upvotes: 2