Baz

Reputation: 13175

Match a sentence

I wish to chop some text into sentences.

I wish to match all text up to one of the following, in a non-greedy fashion: a period followed by a space, a question mark followed by a space, or an exclamation mark followed by a space.

Additionally, the punctuation might be found at the very end of the string, or be followed by a \r\n, for example.

This will almost do it:

([^\.\?\!]*)

But I'm missing the space in the expression. How do I fix this?

Example:

I' a.m not. So? Sure about this! Actually.

Should give:

I' a.m not
So
Sure about this
Actually

Upvotes: 1

Views: 178

Answers (4)

Bohemian

Reputation: 425348

Use a non-greedy match with a lookahead:

^.*?(?=[.!?]( |$))

Note how you don't have to escape those chars when they are in a character class [...].
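As a rough illustration, here is how that pattern behaves on the question's example. Python's re module is my assumption here, since the question doesn't name a language, and the variable names are made up for the sketch.

```python
import re

text = "I' a.m not. So? Sure about this! Actually."

# The answer's pattern, unchanged: a lazy match up to punctuation that is
# followed by a space or by the end of the string.
pattern = re.compile(r"^.*?(?=[.!?]( |$))")

first = pattern.search(text).group(0)
print(first)  # I' a.m not

# Inside a character class the escaped and unescaped forms find the same
# characters, so the backslashes are optional there.
unescaped = re.findall(r"[.!?]", text)
escaped = re.findall(r"[\.\!\?]", text)
print(unescaped == escaped)  # True
```

Because of the leading ^, a single search only returns the first sentence. The period in "a.m" is skipped because it is not followed by a space.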

Upvotes: 1

stema

Reputation: 93046

You can achieve such conditions by using positive lookahead assertions.

[^.?!]+(?=[.?!] )

See it here on Regexr.

When you look at the demo, you can see that sentences at the end of a row with no following space are not matched. You can fix this by adding an alternation with the anchor $ and using the modifier m (which makes $ match at the end of each row):

[^.?!]+(?=[.?!](?: |$))

See it here on Regexr
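To make the difference between the two patterns concrete, here is a small sketch in Python (my choice of language, not something the question specifies). The sample text and variable names are invented for illustration, and re.MULTILINE is Python's spelling of the m modifier.

```python
import re

# Hypothetical sample: the third sentence ends a row, the fourth ends the string.
sample = "Not sure. So? Sure about this!\nActually."

# First pattern: punctuation must be followed by a space.
space_only = re.findall(r"[^.?!]+(?=[.?!] )", sample)

# Second pattern: punctuation may also be followed by the end of a row.
space_or_eol = re.findall(r"[^.?!]+(?=[.?!](?: |$))", sample, flags=re.MULTILINE)

# Matches keep the whitespace that preceded them, so strip it afterwards.
print([s.strip() for s in space_only])    # ['Not sure', 'So']
print([s.strip() for s in space_or_eol])  # ['Not sure', 'So', 'Sure about this', 'Actually']

# On the question's own example, the inner period in "a.m" cuts the first match.
question = "I' a.m not. So? Sure about this! Actually."
from_question = re.findall(r"[^.?!]+(?=[.?!](?: |$))", question, flags=re.MULTILINE)
print(from_question[0])  # m not
```

Because [^.?!]+ can never cross a period, the "a.m" case from the question comes out as "m not" rather than "I' a.m not". Keep that in mind if abbreviations are likely in your text.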

Upvotes: 2

Erica Tripp

Reputation: 326

This should do it:

^.*?(?=[!.?][\s])
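A quick sketch of what this pattern does and does not cover, assuming Python's re module (the question doesn't say which engine is in use). The two input strings are made up for the example.

```python
import re

pattern = re.compile(r"^.*?(?=[!.?][\s])")

# \s also matches \r and \n, so punctuation followed by a line break is found.
with_crlf = pattern.search("Sure about this!\r\nActually.")
print(with_crlf.group(0))  # Sure about this

# Punctuation at the very end of the string has no whitespace after it,
# so the lookahead fails and there is no match.
at_end = pattern.search("Actually.")
print(at_end)  # None
```

So the \r\n case from the question is handled, but the end-of-string case would still need something like an added |$ alternative in the lookahead.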

Upvotes: 0

Zack Newsham

Reputation: 3002

Try this:

(.*?[!\.\?] )

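.*? matches as few characters as possible,

[!\.\?] matches any one of these characters,

and the () gives you a group to reference so you can get the match out. Note that the group as written also captures the punctuation and the trailing space.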
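Here is a minimal sketch of pulling the group out, assuming Python's re module (the language is my assumption). The second pattern is a hypothetical variant of mine, not part of the answer: it moves the punctuation outside the group and also accepts the end of the string.

```python
import re

text = "I' a.m not. So? Sure about this! Actually."

# The answer's pattern, unchanged. With one group, findall returns the
# group contents, which here include the punctuation and the space.
as_posted = re.findall(r"(.*?[!\.\?] )", text)
print(as_posted)  # ["I' a.m not. ", 'So? ', 'Sure about this! ']

# Hypothetical variant: capture only the sentence text, and let the
# terminator be followed by a space or by the end of the string.
variant = re.findall(r"(.*?)[!.?](?: |$)", text)
print(variant)  # ["I' a.m not", 'So', 'Sure about this', 'Actually']
```

The pattern as posted skips "Actually." because nothing follows the final period. The variant consumes the punctuation and the space, so each new match starts cleanly at the next sentence.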

Upvotes: 1
