Reputation: 119
I'm trying to remove two line break characters at the end of each line in a text file, but only if they follow a lowercase letter, i.e. [a-z]. I have the following code:
import re
text = "I want this \n\nwith one line break \n\nIn this place \nNothing happens \n\nHere\nnothing should\nhappen too."
texts = re.sub(r"(?<=[a-z])\r?\n\n"," ", text)
print(texts)
This is the text I want to obtain:
text = "I want this \nwith one line break \n\nIn this place \nNothing happens \n\nHere\nnothing should\nhappen too."
With my code, nothing changes in the text. My code is based on this answered question: "Regular expression to remove line breaks", but I am unable to adapt it to two line breaks.
Upvotes: 0
Views: 121
Reputation: 54148
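It seems you need a positive lookahead, not a lookbehind, so that the lowercase character is required after the line breaks. Your lookbehind never matches because every double line break in the sample is preceded by a space rather than a letter. To get your expected text, the replacement should also be "\n" rather than " ":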
import re
expected = "I want this \nwith one line break \n\nIn this place \nNothing happens \n\nHere\nnothing should\nhappen too."
text = "I want this \n\nwith one line break \n\nIn this place \nNothing happens \n\nHere\nnothing should\nhappen too."
texts = re.sub(r"\n\n(?=[a-z])", "\n", text)
print(texts == expected)
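To see why the original pattern changes nothing, here is a small sketch that counts the matches of both patterns on the sample text. It also includes a variant that tolerates Windows-style `\r\n` endings, since the question's pattern used `\r?`. The helper name `collapse_double_breaks` is hypothetical, and the variant assumes each break is either `\n` or `\r\n`.

```python
import re

text = "I want this \n\nwith one line break \n\nIn this place \nNothing happens \n\nHere\nnothing should\nhappen too."

# Lookbehind: needs a lowercase letter immediately BEFORE the breaks.
# Every double break in the sample is preceded by a space, so no match.
lookbehind_hits = re.findall(r"(?<=[a-z])\r?\n\n", text)

# Lookahead: needs a lowercase letter immediately AFTER the breaks.
lookahead_hits = re.findall(r"\n\n(?=[a-z])", text)


def collapse_double_breaks(s):
    # Hypothetical helper (not from the question): keep the first break
    # (group 1), drop the second one, and only do so when the next
    # character is a lowercase ASCII letter. Tolerates an optional
    # carriage return before each newline.
    return re.sub(r"(\r?\n)\r?\n(?=[a-z])", r"\1", s)


print(len(lookbehind_hits), len(lookahead_hits))
print(collapse_double_breaks(text) == re.sub(r"\n\n(?=[a-z])", "\n", text))
```

On the sample text the lookbehind finds no match and the lookahead finds exactly one (the break before "with"), so only that break is collapsed.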
Upvotes: 1