Reputation: 99
I have text like this in a text file:
Long sleeve wool coat in black. Breast pocket.
I want output where every sentence is printed on its own line, like this:
Long sleeve wool coat in black.
Breast pocket.
I tried the approach from a related question, but it gives the following output:
Long sleeve wool coat in black.
Breast pocket.
None
I also have to do this for multiple text files: read each original file and overwrite it with its sentences split onto separate lines. But when I try that, only None gets written to the file, not the existing lines.
Any help is appreciated. Thanks in advance.
Upvotes: 2
Views: 2290
Reputation: 2711
Try:
in_s = 'Long sleeve wool coat in black. Breast pocket.'
in_s += ' '
out = in_s.split('. ')[:-1]
# join() only puts '.\n' between items, so add the final period back
print('.\n'.join(out) + '.')
Explanation:
in_s += ' '
adds a space at the end of the string so that it ends in '. ' like every other sentence.
in_s.split('. ')
splits the text wherever there is a period followed by a space ('. ').
[:-1]
removes the last value, which is an empty string ('') when the text ends in a period and a space.
'.\n'.join(out) + '.'
separates the values with a period and a newline, and restores the period after the last sentence, before printing.
Upvotes: 2
Reputation: 43169
Do yourself a favour and use nltk instead of regular expressions or even a simple str.split():
from nltk import sent_tokenize
# sent_tokenize needs NLTK's Punkt tokenizer data (installed via nltk.download)
string = "Long sleeve wool coat in black. Breast pocket. Mr. Donald Trump is the president of the U.S.A."
for sent in sent_tokenize(string):
    print(sent)
Which yields
Long sleeve wool coat in black.
Breast pocket.
Mr. Donald Trump is the president of the U.S.A.
This approach most likely handles edge cases such as abbreviations ("Mr.", "U.S.A."), which most simpler splitting approaches won't.
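To make the edge case concrete, here is a small sketch of what a naive split on '. ' does to the same string. It uses only built-in string methods; the variable names are just for illustration.

```python
text = ("Long sleeve wool coat in black. Breast pocket. "
        "Mr. Donald Trump is the president of the U.S.A.")

# Naive approach: start a new line after every period followed by a space.
naive_lines = text.replace('. ', '.\n').split('\n')

for line in naive_lines:
    print(line)
```

The abbreviation "Mr." is followed by a space, so the naive version treats it as the end of a sentence and puts "Mr." on a line by itself.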
Upvotes: 2
Reputation: 4234
Try:
s = 'Long sleeve wool coat in black. Breast pocket.'
print(s.replace('. ', '.\n'))
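For the part of the question about overwriting files, a minimal sketch could look like the following. The question's own code isn't shown, so this rests on an assumption: None usually ends up in a file when a function prints its result instead of returning it, and the (missing) return value is then written. The helper names split_sentences and rewrite_file are made up for this example.

```python
from pathlib import Path


def split_sentences(text):
    """Return the text with a newline after every '. ' sentence break."""
    # Return the new string instead of printing it, so callers never get None.
    return text.replace('. ', '.\n')


def rewrite_file(path):
    """Read a file, split its sentences onto separate lines, and overwrite it."""
    path = Path(path)
    original = path.read_text(encoding='utf-8')
    path.write_text(split_sentences(original), encoding='utf-8')
```

To process several files, call rewrite_file once per path in a loop. Since this overwrites the originals, it may be worth trying it on copies first.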
Upvotes: 3