Regex that splits long text into separate sentences with match()

Question

This is a textarea where the user writes some text. I've written an example in it.

First sentence. Second sentence? Third sentence!
Fourth sentence.

Fifth sentence

Requirements already considered in the regex

separator is included in array item
last sentence doesn't necessarily require a separator character (it can end with any character)
if a sentence ends with more than one separator character, all of them are included in the array item. Example: second sentence?!? should be [...,"second sentence?!?",...]

Missing requirement (I need help with this) <<

Each new line should be represented by an empty array item. If the regex is applied, this should be the response:

["First sentence.", "Second sentence?", "Third sentence!", "", "Fourth sentence.", "", "", "Fifth sentence"]

Instead, I'm receiving this:

["First sentence.", "Second sentence?", "Third sentence!", "Fourth sentence.", "Fifth sentence"]

This is the regex and match call:

var tregex = /[^\r\n.!?]+(:?(:?\r\n|[\r\n]|[.!?])+|$)/gi;
var sentences = $('#text').val().match(tregex).map($.trim);

Any ideas? Thanks!

matt3141 · Accepted Answer

I simplified it a lot. The pattern matches either a line break on its own or a run of text that ends in punctuation (or at the end of a line):

var tregex = /\n|([^\r\n.!?]+([.!?]+|$))/gim;

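I also believe the m flag (multiline) is important: with it, $ matches at the end of every line rather than only at the end of the whole string, so a line in the middle of the text that has no closing punctuation is still matched.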
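Here is a minimal sketch of how the pattern could be used without jQuery, to check it against the sample text from the question. The helper name splitSentences is hypothetical, and the sample string assumes the textarea value uses \n line breaks. Each bare newline match becomes an empty string after trimming, which is what produces the empty array items.

```javascript
// Hypothetical helper: applies the pattern with String.prototype.match
// and trims every match, mirroring .map($.trim) from the question.
function splitSentences(text) {
  // Either a bare newline, or a run of non-terminator characters followed by
  // one or more terminators (or by the end of a line, because of the m flag).
  var tregex = /\n|([^\r\n.!?]+([.!?]+|$))/gim;
  // match() returns null when nothing matches, so fall back to an empty array.
  var matches = text.match(tregex) || [];
  return matches.map(function (s) { return s.trim(); });
}

var sample = "First sentence. Second sentence? Third sentence!\n" +
             "Fourth sentence.\n\n" +
             "Fifth sentence";

console.log(splitSentences(sample));
// ["First sentence.", "Second sentence?", "Third sentence!", "",
//  "Fourth sentence.", "", "", "Fifth sentence"]
```

The `|| []` guard is an addition of this sketch: the call in the question would throw on an empty textarea, because match() returns null in that case.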

Answers (2)