Reputation: 902
I have the following text file:
It’s hard to explain puns to kleptomaniacs because they always take things literally.
I used to think the brain was the most important organ. Then I thought, look what’s telling me that.
I use the following script to get rid of the numberings and newlines:
import re
with open('jokes.txt', 'r+') as original_file:
    modfile = original_file.read()
    modfile = re.sub("\d+\. ", "", modfile)
    modfile = re.sub("\n", "", modfile)
    original_file.seek(0)
    original_file.truncate()
    original_file.write(modfile)
After running the script, this is how my text file looks:
It’s hard to explain puns to kleptomaniacs because they always take things literally. I used to think the brain was the most important organ. Then I thought, look what’s telling me that.
I'd like the file to be:
It’s hard to explain puns to kleptomaniacs because they always take things literally.
I used to think the brain was the most important organ. Then I thought, look what’s telling me that.
How do I delete the newlines without merging all the lines into one?
Upvotes: 0
Views: 60
Reputation: 71548
You can use a single replace, with the following regex:
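re.sub(r"\d+\. |(?<!^)\n", "", modfile, flags=re.MULTILINE)
(?<!^)\n will match a newline unless it's at the start of a line, so the newline that makes up an empty line is kept while the one ending a line of text is removed. The flag re.MULTILINE makes ^ match at the beginning of every line, not only at the start of the string.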
In code:
import re
with open('jokes.txt', 'r+') as original_file:
    modfile = original_file.read()
    # Strip the "N. " numbering and every newline that is not an empty line
    modfile = re.sub(r"\d+\. |(?<!^)\n", "", modfile, flags=re.MULTILINE)
    original_file.seek(0)
    original_file.truncate()
    original_file.write(modfile)
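To check the pattern without touching a file, here is a minimal sketch on an in-memory string. The sample text and the helper name strip_numbering are made up for illustration, and the sketch assumes the file has numbered entries separated by one blank line, which is what the lookbehind relies on.

```python
import re

# Hypothetical sample: numbered entries separated by a blank line (assumed layout).
sample = (
    "1. First joke.\n"
    "\n"
    "2. Second joke. It has two sentences.\n"
)

def strip_numbering(text):
    """Remove "N. " prefixes and newlines that do not sit at the start of a line."""
    return re.sub(r"\d+\. |(?<!^)\n", "", text, flags=re.MULTILINE)

cleaned = strip_numbering(sample)
print(cleaned)
```

Each entry ends up on its own line. Note that the final newline of the file is removed as well, since it is not at the start of a line.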
You can also use a negative lookahead instead of a lookbehind if you want:
r"\d+\. |\n(?!\n)"
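As a quick sanity check, the sketch below runs both patterns on the same string. The sample is hypothetical and again assumes one blank line between numbered entries. The lookahead version contains no ^, so it does not need re.MULTILINE.

```python
import re

# Hypothetical sample: numbered entries separated by a blank line (assumed layout).
sample = "1. First joke.\n\n2. Second joke.\n"

# Lookbehind: drop a newline unless it is at the start of a line.
with_lookbehind = re.sub(r"\d+\. |(?<!^)\n", "", sample, flags=re.MULTILINE)

# Lookahead: drop a newline unless another newline follows it.
with_lookahead = re.sub(r"\d+\. |\n(?!\n)", "", sample)

print(with_lookbehind == with_lookahead)
```

The two differ in which newline of each pair they keep (the second versus the first), but on input shaped like this the resulting text is the same.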
Upvotes: 2