Reputation: 13175
I wish to chop some text into sentences.
I wish to match all text up until a period followed by a space, a question mark followed by a space, or an exclamation mark followed by a space, in a non-greedy fashion.
Additionally, the punctuation might be found at the very end of the string or be followed by a \r\n, for example.
This will almost do it:
([^\.\?\!]*)
But I'm missing the space in the expression. How do I fix this?
Example:
I' a.m not. So? Sure about this! Actually.
Should give:
I' a.m not
So
Sure about this
Actually
Upvotes: 1
Views: 178
Reputation: 425348
Use a non-greedy match with a lookahead:
^.*?(?=[.!?]( |$))
Note that you don't have to escape those characters when they are inside a character class [...].
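As a rough sketch of how this behaves in Python (the language, the sample text taken from the question, and the names first and SENTENCE are assumptions, not part of the answer): because of the ^ anchor, the pattern as written finds only the first sentence. One possible variation drops the anchor, consumes the punctuation outside a capture group, and uses \s in the lookahead so that \r\n also counts as a boundary.

```python
import re

text = "I' a.m not. So? Sure about this! Actually."

# Pattern as given in the answer: ^ pins it to the start of the string,
# so a search yields the first sentence only.
first = re.search(r"^.*?(?=[.!?]( |$))", text).group(0)

# Hypothetical variation: no anchor, terminator consumed but kept out of
# the group, and \s in the lookahead so "\r\n" also ends a sentence.
SENTENCE = re.compile(r"\s*(.+?)[.!?](?=\s|$)")
sentences = SENTENCE.findall(text)

print(first)
print(sentences)
```

Since . does not match a newline by default, this variation assumes that no sentence spans more than one line.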
Upvotes: 1
Reputation: 93046
You can achieve such conditions by using positive lookahead assertions.
[^.?!]+(?=[.?!] )
See it here on Regexr.
When you look at the demo, the sentences at the end of a row with no following space are not matched. You can fix this by adding an alternation with the anchor $ and using the modifier m (which makes $ match at the end of each row):
[^.?!]+(?=[.?!](?: |$))
See it here on Regexr
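To see what the m modifier changes, here is a small Python sketch (Python itself, the two-row sample string, and the variable names are assumptions made for illustration). It runs the second pattern with and without re.MULTILINE.

```python
import re

# Second pattern from the answer, unchanged.
pattern = r"[^.?!]+(?=[.?!](?: |$))"

# Hypothetical two-row input; the first row ends without a trailing space.
rows = "Is it late? Not yet.\nGood! Let us go."

without_m = re.findall(pattern, rows)
with_m = re.findall(pattern, rows, flags=re.MULTILINE)

# Matches keep the whitespace that preceded them, so trim afterwards.
trimmed = [s.strip() for s in with_m]

print(without_m)
print(with_m)
print(trimmed)
```

Without the flag, "Not yet" is skipped because $ does not match before the inner newline. Note also that the character class excludes punctuation, so a period inside a token starts a new match: on the question's sample, the first match is "m not" rather than "I' a.m not".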
Upvotes: 2
Reputation: 3002
Try this:
(.*?[!\.\?] )
.*? matches any characters, but as few as possible,
[] matches any one of the characters listed inside it,
and the () gives you a group to reference so you can get the match out.
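A quick Python check of this pattern against the question's sample (Python and the variable names are assumptions here) suggests that each match keeps its punctuation and trailing space, and that the last sentence is skipped because no space follows it. The second pattern below is a hypothetical tweak, not part of the answer, that also accepts the end of the string.

```python
import re

text = "I' a.m not. So? Sure about this! Actually."

# Pattern from the answer: each match must end in punctuation plus a space.
as_given = re.findall(r"(.*?[!\.\?] )", text)

# Hypothetical tweak: accept end of string as well as a space, and keep
# that separator outside the group.
with_end = re.findall(r"(.*?[!.?])(?: |$)", text)

print(as_given)
print(with_end)
```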
Upvotes: 1