Andres SK

Reputation: 10974

Including separator characters in split (javascript)

This is a textarea. The user can write anything.

<textarea id="text">First sentence. Second sentence? Third sentence!
Fourth sentence.

Fifth sentence
</textarea>

At the end, I have to split all the text into an array.

var sentences = $('#text').val().split(/\r\n|\r|\n|[.|!|?]\s/gi);

The issue I'm having is that the separator characters are missing from the array items. This is what sentences contains:

["First sentence", "Second sentence", "Third sentence", "Fourth sentence", "Fifth sentence"]

It should be:

["First sentence.", "Second sentence?", "Third sentence!", "", "Fourth sentence.", "", "", "Fifth sentence"]

Extra considerations:

Any ideas? Any approach is welcome (not split() necessarily) - Thanks!

Upvotes: 3

Views: 2338

Answers (5)

Esailija

Reputation: 140210

var re = /[^\r\n.!?]+(?:(?:\r\n|[\r\n]|[.!?])+|$)/gi;
("First sentence.. Second sentence?? Third sentence!!\n"+ "Fourth sentence").match(re).map($.trim)
//["First sentence..", "Second sentence??", "Third sentence!!", "Fourth sentence"]
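For readers without jQuery, here is a minimal sketch of the same idea using only built-ins. The helper name `matchSentences` is hypothetical and not from the answer; the pattern is written with non-capturing groups `(?:...)`. The trailing `|$` alternative is what lets the last sentence match even when it has no terminator.

```javascript
// Hypothetical helper, not part of the original answer.
function matchSentences(text) {
  // One run of non-terminator characters, followed by either one or more
  // terminators (line breaks or . ! ?) or the end of the input.
  const re = /[^\r\n.!?]+(?:(?:\r\n|[\r\n]|[.!?])+|$)/g;
  // match() returns null when nothing matches, so fall back to an empty array.
  return (text.match(re) || []).map((s) => s.trim());
}

const demoMatchSentences = matchSentences(
  "First sentence.. Second sentence?? Third sentence!!\nFourth sentence"
);
console.log(demoMatchSentences);
// ["First sentence..", "Second sentence??", "Third sentence!!", "Fourth sentence"]
```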

Upvotes: 3

Blazemonger

Reputation: 92893

Use .match() instead. When you call it with a regex that has the g flag, it returns an array of all matches. You just need to modify your regex first:

var sentences = $('#text').val().match(/[^\r\n.!?]+(\r\n|\r|\n|[.!?])\s*/gi);

http://jsfiddle.net/kEHhA/3/
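As a rough illustration, the same pattern can be tried on a plain string instead of the textarea value. The sample text below is an assumption that mirrors the question's textarea, and the variable names are made up for the sketch.

```javascript
// Sample input mirroring the textarea in the question (assumed to end with a line break).
const sampleText =
  "First sentence. Second sentence? Third sentence!\nFourth sentence.\n\nFifth sentence\n";

// Each match is a run of non-terminators, one terminator, then any trailing whitespace.
// match() returns null when nothing matches, hence the fallback to an empty array.
const matched = sampleText.match(/[^\r\n.!?]+(\r\n|\r|\n|[.!?])\s*/g) || [];

// Trim the whitespace that \s* pulled into each match.
const trimmedSentences = matched.map((s) => s.trim());
console.log(trimmedSentences);
// ["First sentence.", "Second sentence?", "Third sentence!", "Fourth sentence.", "Fifth sentence"]
```

Two things to keep in mind with this pattern: blank lines do not produce empty array entries, and a final sentence with neither a terminator nor a trailing line break is not matched at all.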

Upvotes: 8

Cecchi

Reputation: 1535

Does this work for your purposes? It looks like you're already using jQuery but if not it should be easy to modify:

var sentences = [];
$.each($('#text').val().split(/([^\.\?\!\r\n]+.)\s/gi), function(i, sentence) {
  if(i%2 !== 0) {
    sentences.push(sentence);
  }
});
// sentences = ["First sentence.", "Second sentence?", "Third sentence!", "Fourth sentence."]

Edit: Blazemonger's solution is similar but more elegant, using match() instead of split() and therefore not needing the second step of removing the odd elements in the array.
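The reason the odd-index filter works is that split() includes the text of capture groups in its result. A jQuery-free sketch of that behaviour follows; the function name `splitKeepingTerminators` and the sample string are assumptions made for illustration.

```javascript
// Hypothetical helper, not part of the original answer.
function splitKeepingTerminators(text) {
  // Because the sentence is wrapped in a capture group, split() inserts
  // each captured sentence into the result array.
  const parts = text.split(/([^.?!\r\n]+.)\s/);
  // Captured sentences land at odd indices; even indices hold whatever
  // text sat between the matches (often empty strings).
  return parts.filter((_, i) => i % 2 !== 0);
}

const demoSplit = splitKeepingTerminators(
  "First sentence. Second sentence? Third sentence!\nFourth sentence.\n"
);
console.log(demoSplit);
// ["First sentence.", "Second sentence?", "Third sentence!", "Fourth sentence."]
```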

Upvotes: 1

Felix Kling

Reputation: 816364

It would be easy with look-behinds, but since JavaScript did not support them at the time of writing, my suggestion would be:

Find the white space characters you want to split on and replace them with some dummy character. Then split on that character.

Something like:

$('#text').val().replace(/\r\n|\r|\n|([.!?])\s/gi, '$1\0').split(/\0/g);

Edit: Apparently there are better solutions which don't rely on split. I will leave this as an alternative, however.
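A small sketch of the replace-then-split idea on a plain string, alongside the look-behind form for comparison. The sample input and variable names are assumptions; the marker approach assumes the input contains no NUL characters, and the second variant only runs on engines that implement ES2018 lookbehind assertions.

```javascript
// Sample input (assumed); note the blank line before the last sentence.
const input =
  "First sentence. Second sentence? Third sentence!\nFourth sentence.\n\nFifth sentence";

// Variant 1: mark each split point with a NUL character, keeping the
// punctuation captured in group 1, then split on the marker.
// For plain line breaks group 1 is unset, so "$1" expands to an empty string.
const viaMarker = input
  .replace(/\r\n|\r|\n|([.!?])\s/g, "$1\0")
  .split("\0");

// Variant 2: a lookbehind splits on whitespace that follows a terminator
// without consuming the terminator itself (needs ES2018 lookbehind support).
const viaLookbehind = input.split(/\r\n|\r|\n|(?<=[.!?])\s/);

console.log(viaMarker);
console.log(viaLookbehind);
// Both: ["First sentence.", "Second sentence?", "Third sentence!", "Fourth sentence.", "", "Fifth sentence"]
```

Unlike the match-based approaches, both variants keep an empty string for each blank line in the input.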

Upvotes: 0

duffy356

Reputation: 3718

What about:

var sentences = $('#text').val().split(/\r\n|\r|\n|\s/gi);

Upvotes: 0
