I have a text file and I want to remove the newline characters between adjacent lines when both lines contain only capital-letter words/characters. So if one line is ABCD and the next line is AB, the result should be ABCD AB. I can do it by looping over the text line by line, but I would like a more elegant way, preferably with regex. Here is a text example:
ABCD
AB
abcd ABB
cd
AB
ABC
ABCD
ab
and I want to get this:
ABCD AB
abcd ABB
cd
AB ABC ABCD
ab
I've written the following, but it only works for two capital lines in a row, not more.
import re

r = re.compile(r'(\n)([A-Z ]+)(\n)([A-Z ]+)(\n)')
text = r.sub(r'\1\2 \4\5', text)
Assume there are no complexities beyond this (the text is already as clean as the example). I am a newbie struggling to learn regex! Thanks.
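For reference, here is a minimal sketch of one possible regex approach, not necessarily the best one. The attempt above stops at two lines because each match consumes its trailing newline, so the next match cannot begin with that same newline; it also requires a newline before the first line, which a run at the very start of the text does not have. The sketch instead matches a whole run of capital-only lines at once and replaces the newlines inside that run. The function name `merge_capital_lines` and the variable `sample` are made up for illustration, and a "capital line" is assumed to mean one or more characters from `[A-Z ]`, as in the pattern above.

```python
import re

# A run of two or more consecutive lines made up only of capital letters and spaces.
# With re.MULTILINE, ^ and $ match at the start and end of every line.
CAPITAL_RUN = re.compile(r'^[A-Z ]+$(?:\n[A-Z ]+$)+', re.MULTILINE)


def merge_capital_lines(text):
    """Join each run of adjacent capital-only lines into one space-separated line."""
    # The replacement callback sees the whole run and swaps its inner newlines for spaces.
    return CAPITAL_RUN.sub(lambda m: m.group().replace('\n', ' '), text)


sample = "ABCD\nAB\nabcd ABB\ncd\nAB\nABC\nABCD\nab"
print(merge_capital_lines(sample))
```

Because the character class includes the space, a line consisting only of spaces would also count as a capital line under this assumption; empty lines never match and so always break a run.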
Upvotes: 1
Views: 66