Reputation: 10974
This is a textarea. The user can write anything.
<textarea id="text">First sentence. Second sentence? Third sentence!
Fourth sentence.
Fifth sentence
</textarea>
At the end, I have to split all the text into an array.
var sentences = $('#text').val().split(/\r\n|\r|\n|[.|!|?]\s/gi);
The issue I'm having is that the separator characters are not present in the array item values. This is what sentences returns:
["First sentence", "Second sentence", "Third sentence", "Fourth sentence", "Fifth sentence"]
It should be:
["First sentence.", "Second sentence?", "Third sentence!", "", "Fourth sentence.", "", "", "Fifth sentence"]
Extra considerations:
Any ideas? Any approach is welcome (not split() necessarily) - Thanks!
Upvotes: 3
Views: 2338
Reputation: 140210
var re = /[^\r\n.!?]+(?:(?:\r\n|[\r\n]|[.!?])+|$)/gi;
("First sentence.. Second sentence?? Third sentence!!\n"+ "Fourth sentence").match(re).map($.trim)
//["First sentence..", "Second sentence??", "Third sentence!!", "Fourth sentence"]
Upvotes: 3
Reputation: 92893
Use .match instead (docs). When you use it with a /.../g-type regex, it returns an array of all matches. You just need to modify your regex first:
var sentences = $('#text').val().match(/[^\r\n.!?]+(\r\n|\r|\n|[.!?])\s*/gi);
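A minimal sketch of the same match() idea in plain JavaScript, without jQuery. The helper name splitSentences and the sample string are made up for illustration. The regex here is a simplified variant, not the one above: it also keeps a final sentence that has no terminator, and it guards against match() returning null when nothing matches.

```javascript
// Hypothetical helper (name not from the thread): collect sentences with match().
function splitSentences(text) {
  // One or more non-terminator, non-newline characters,
  // followed by an optional run of terminators (. ! ?).
  var re = /[^\r\n.!?]+[.!?]*/g;
  // match() with the g flag returns null when there is no match at all.
  var matches = text.match(re) || [];
  return matches
    .map(function (s) { return s.trim(); })
    .filter(function (s) { return s.length > 0; }); // drop whitespace-only pieces
}

var sample = "First sentence. Second sentence? Third sentence!\n" +
             "Fourth sentence.\nFifth sentence\n";
console.log(splitSentences(sample));
// ["First sentence.", "Second sentence?", "Third sentence!", "Fourth sentence.", "Fifth sentence"]
```

Blank lines produce no entries with this variant, so it does not reproduce the empty strings shown in the question's desired output.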
Upvotes: 8
Reputation: 1535
Does this work for your purposes? It looks like you're already using jQuery but if not it should be easy to modify:
var sentences = [];
$.each($('#text').val().split(/([^\.\?\!\r\n]+.)\s/gi), function(i, sentence) {
    // With a capturing group, split() puts the captured text at the odd indexes.
    if (i % 2 !== 0) {
        sentences.push(sentence);
    }
});
// sentences = ["First sentence.", "Second sentence?", "Third sentence!", "Fourth sentence."]
Edit: Blazemonger's solution is similar but more elegant, using match() instead of split() and therefore not needing the second step of discarding the even-indexed elements of the array.
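The split() approach relies on the fact that when the separator regex contains a capturing group, the captured text is included in the result array. A rough sketch of that behaviour in plain JavaScript follows. The function name splitKeepingTerminators is hypothetical, and the regex is a simplified one that only handles ., ! and ? (not line breaks), so treat it as an illustration rather than a drop-in replacement.

```javascript
// Hypothetical helper: split on terminators, but keep them via a capture group.
function splitKeepingTerminators(text) {
  // Captured terminator runs land at the odd indexes of the result.
  var parts = text.split(/([.!?]+)\s*/);
  var sentences = [];
  for (var i = 0; i < parts.length; i += 2) {
    // Re-attach each sentence body to the terminator that followed it, if any.
    var sentence = parts[i] + (parts[i + 1] || "");
    if (sentence.length > 0) {
      sentences.push(sentence);
    }
  }
  return sentences;
}

console.log("One. Two? Three!".split(/([.!?]+)\s*/));
// ["One", ".", "Two", "?", "Three", "!", ""]
console.log(splitKeepingTerminators("One. Two? Three!"));
// ["One.", "Two?", "Three!"]
```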
Upvotes: 1
Reputation: 816364
It would be easy with look-behinds, but since JavaScript does not support them (as of this writing; look-behind assertions were only added later, in ES2018), my suggestion would be:
Find the white space characters you want to split on and replace them with some dummy character. Then split on that character.
Something like:
$('#text').val().replace(/\r\n|\r|\n|([.!?])\s/gi, '$1\0').split(/\0/g);
Edit: Apparently there are better solutions which don't rely on split. I will leave this as an alternative, however.
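For engines that do support look-behind assertions (ES2018 and later), the dummy-character step can probably be skipped. The sketch below assumes such an engine; the function name splitWithLookbehind and the sample text are invented for the example.

```javascript
// Hypothetical helper; requires an engine with look-behind support (ES2018+).
function splitWithLookbehind(text) {
  // Split on a line break, or on a single whitespace character
  // that is directly preceded by ., ! or ? (the terminator is not consumed).
  return text.split(/\r\n|\r|\n|(?<=[.!?])\s/);
}

var sample = "First sentence. Second sentence? Third sentence!\n" +
             "Fourth sentence.\nFifth sentence\n";
console.log(splitWithLookbehind(sample));
// ["First sentence.", "Second sentence?", "Third sentence!", "Fourth sentence.", "Fifth sentence", ""]
```

The trailing empty string comes from the final line break. The replace-and-split version above produces the same trailing entry, so filter it out if it is not wanted.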
Upvotes: 0
Reputation: 3718
What about:
var sentences = $('#text').val().split(/\r\n|\r|\n|\s/gi);
Upvotes: 0