Creek Barbara

Reputation: 647

Regex/Python: Substitution in Python when the regex already does the substitution

I'm trying to remove duplicate lines with this regex that works great:

(.*+)\n*(\1\n+)*

But when I try to use it in Python it doesn't work:

response1 = re.sub(r'(.*+)\n*', r'(\1\n+)*', response1)

Error:

Exception has occurred: re.error
multiple repeat at position 3

Am I doing something wrong?

Thank you.

Upvotes: 0

Views: 157

Answers (1)

user2468968

Reputation: 286

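The "multiple repeat at position 3" error comes from this part of the regex:

.*+

Python's re module before version 3.11 does not support possessive quantifiers, so it reads "*+" as two quantifiers stacked on the same item. Use either ".*" or ".+" instead. Note also that the second argument of re.sub is the replacement text, not a continuation of the pattern, so the whole pattern has to go in the first argument. Something like the following should remove consecutive duplicate lines: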

import re

response = """A
A
A
B
B
A
A
"""
# Group 1 captures one line; (\1)+ consumes the identical lines that follow it.
print(re.sub(r'(.*\n)(\1)+', r'\2', response))

Output

A
B
A
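
One caveat with an unanchored pattern like this: it can start matching in the middle of a line, so the input "xA\nA\n" would lose its second line even though "xA" and "A" are different. Below is a minimal sketch of two stricter variants. The function names are made up for illustration, and it assumes that only consecutive duplicates should be collapsed. The first anchors the pattern at the start of a line with re.MULTILINE; the second avoids regex entirely with itertools.groupby, which also handles a final line that has no trailing newline.

```python
import re
from itertools import groupby


def dedupe_consecutive_regex(text: str) -> str:
    """Collapse runs of identical consecutive lines using an anchored regex.

    Hypothetical helper name. Every line must end with "\n" to be compared,
    so a last line without a trailing newline is left untouched.
    """
    # With re.MULTILINE, ^ matches at the start of each line, so group 1
    # always holds a whole line and never just the tail of one.
    return re.sub(r'^(.*\n)\1+', r'\1', text, flags=re.MULTILINE)


def dedupe_consecutive_lines(text: str) -> str:
    """Collapse runs of identical consecutive lines without regex.

    Hypothetical helper name. str.splitlines() also splits on "\r\n" and a
    few other separators, and the result is re-joined with "\n".
    """
    lines = text.splitlines()
    # groupby groups adjacent equal items; keeping one key per group drops
    # the repeats while preserving order.
    deduped = [line for line, _ in groupby(lines)]
    result = "\n".join(deduped)
    # Restore the trailing newline if the input had one.
    if text.endswith("\n"):
        result += "\n"
    return result


if __name__ == "__main__":
    response = "A\nA\nA\nB\nB\nA\nA\n"
    print(dedupe_consecutive_regex(response), end="")
    print(dedupe_consecutive_lines(response), end="")
```

Both functions leave non-adjacent duplicates alone, which matches the behavior of the answer's pattern. If duplicates anywhere in the text should be removed, a different approach (for example tracking already-seen lines in a set) would be needed.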

Upvotes: 1
