Reputation: 6039
I want to replace a dot, ? or ! followed by spaces (if any) with a newline character \n
and eliminate that whitespace.
So in the case of: hello world. It's nice.
I want it to be: hello world.\nIt's nice.\n
This is what I thought of (but it doesn't work, otherwise I wouldn't be writing this question, right?):
re.sub(r'\.!?( *)', r'.\n\1', line)
Thanks!
Upvotes: 0
Views: 8224
Reputation: 123608
Without lookaround:
>>> import re
>>> line="hello world! What? It's nice."
>>> re.sub(r'([.?!]+) *', r'\1\n', line) # Capture punctuations; discard spaces
"hello world!\nWhat?\nIt's nice.\n"
>>> line="hello world! His thoughts trailed away... What?"
>>> re.sub(r'([.?!]+) *', r'\1\n', line)
'hello world!\nHis thoughts trailed away...\nWhat?\n'
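If you need this in more than one place, the substitution can be wrapped in a small helper. This is only a sketch: the name break_after_punctuation is made up for illustration, and the pattern is the same one used above.

```python
import re

# Same pattern as above: a run of . ? ! (captured), then any trailing spaces.
_PUNCT_THEN_SPACES = re.compile(r'([.?!]+) *')

def break_after_punctuation(line):
    """Hypothetical helper: keep the punctuation, replace the spaces with a newline."""
    return _PUNCT_THEN_SPACES.sub(r'\1\n', line)

print(repr(break_after_punctuation("hello world. It's nice.")))
```

Keep in mind that the pattern has no notion of sentences, so a decimal such as 3.14 is also broken after the dot.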
Upvotes: 3
Reputation: 1123830
Match spaces or the end of the string with a positive look-behind:
re.sub(r'(?<=[.?!])( +|\Z)', r'\n', text)
Because this matches just spaces that are preceded by punctuation, you don't need to use a back reference.
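The + ensures that only punctuation followed by at least one space is matched here; with * instead, the pattern could also match the empty string after each dot of an ellipsis. The text: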
"His thoughts trailed away... His heart wasn't in it!"
would otherwise receive too many newlines.
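To make the difference concrete, here is a small sketch that runs the pattern above next to a variant using * instead of +. The starred variant is hypothetical and only there for comparison.

```python
import re

text = "His thoughts trailed away... His heart wasn't in it!"

# Pattern from this answer: one or more spaces, or the end of the string.
with_plus = re.sub(r'(?<=[.?!])( +|\Z)', '\n', text)

# Hypothetical variant: ' *' may match zero spaces, so it also matches
# the empty string after each dot inside the ellipsis.
with_star = re.sub(r'(?<=[.?!]) *', '\n', text)

print(repr(with_plus))
print(repr(with_star))
```

The first result has one newline per sentence; the second has extra newlines inside the ellipsis.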
Demo:
>>> import re
>>> text = "hello world. It's nice."
>>> re.sub(r'(?<=[.?!])( +|\Z)', r'\n', text)
"hello world.\nIt's nice.\n"
>>> text = "His thoughts trailed away... His heart wasn't in it!"
>>> re.sub(r'(?<=[.?!])( +|\Z)', r'\n', text)
"His thoughts trailed away...\nHis heart wasn't in it!\n"
Upvotes: 2
Reputation: 25351
Did you try str.replace?
print(text.replace('. ', '.\n').replace('? ', '?\n').replace('! ', '!\n'))
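One thing to check before relying on this: each replace call handles exactly one space after the punctuation, and nothing is appended at the end of the string. A quick sketch with a made-up input that has two spaces after the first period:

```python
text = "hello world.  It's nice."  # two spaces after the first period

# Chained str.replace: only the first space after each mark is consumed.
chained = text.replace('. ', '.\n').replace('? ', '?\n').replace('! ', '!\n')

print(repr(chained))
```

The second space survives at the start of the next line, and there is no trailing newline, so this only fits input where punctuation is always followed by a single space.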
Upvotes: 0