Venkat

Reputation: 105

Split paragraph into sentences when paragraph ends with quotes using Javascript

I am trying to split a whole paragraph into sentences using JavaScript regular expressions.

Paragraph:

I visited a bar in Kansas. At the entrance I see, "Welcome to the bar!" While leaving that place I see message, "Good night!" I wondered how they changed the name.

I want to split the above paragraph into sentences.

  1. I visited a bar in Kansas.
  2. At the entrance I see, "Welcome to the bar!"
  3. While leaving that place I see message, "Good night!"
  4. I wondered how they changed the name. (There is a line break (<br>) between "Good night!" and "I wondered how...")

Currently I am using the regular expression

var reg= /(\S.+?[.!?"'] | [.!?] + ["'!.?])(?=\s+[A-Z]|[^<br>]|$)/g;

but it does not treat the line break (<br>) as a sentence boundary. It splits the paragraph into:

  1. I visited a bar in Kansas.
  2. At the entrance I see, "Welcome to the bar!"
  3. While leaving that place I see message, "Good night!" I wondered how they changed the name.

The line break is created by pressing Shift+Enter.

Upvotes: 0

Views: 1585

Answers (1)

skamazin

Reputation: 757

I'm not certain that I understand exactly what you need, but this regex should do the trick:

var re = /(\w[^.!?]+[.!?]+"?)\s?/g;

You can see the matches here (note the g flag, for global matching, on the right side of the regex). I believe it splits the paragraph into the sentences you want. Let me know if there's a problem.

The code should be something along these lines (taken directly from http://regex101.com):

var re = /([^.!?]+[.!?]"?)\s?/g;
var str = 'I visited a bar in Kansas. At the entrance I see, "Welcome to the bar!" While leaving that place I see message, "Good night!"\nI wondered how they changed the name.';
var sentences = [];
var m;

while ((m = re.exec(str)) !== null) {
    // Guard against an infinite loop if the regex ever matches an empty string.
    if (m.index === re.lastIndex) {
        re.lastIndex++;
    }
    // m[0] is the whole match; m[1] is the sentence without the trailing whitespace.
    sentences.push(m[1]);
}
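
The sample string above uses \n for the line break, while the question mentions a <br> tag. Note that [^<br>] in the original pattern is a character class: it matches any single character other than <, b, r or >, not "anything that is not a <br> tag". One possible way to handle the tag, sketched here under the assumption that the break arrives as <br>, <br/> or <br />, is to turn it into a newline first and then split. The helper name splitSentences is made up for illustration and is not part of any library.

```javascript
// Hypothetical helper (not from the original answer): split a paragraph into
// sentences, treating <br> tags as ordinary whitespace between sentences.
function splitSentences(paragraph) {
    // Assumption: line breaks arrive as <br>, <br/> or <br /> (any letter case).
    var text = paragraph.replace(/<br\s*\/?>/gi, '\n');

    // A sentence here is: a run of non-terminators, one or more terminators
    // (. ! ?), and an optional closing quote.
    var matches = text.match(/[^.!?]+[.!?]+["']?/g) || [];

    // Strip the whitespace (including the inserted newline) around each match.
    return matches.map(function (s) {
        return s.trim();
    });
}

var paragraph = 'I visited a bar in Kansas. At the entrance I see, "Welcome to the bar!" While leaving that place I see message, "Good night!"<br>I wondered how they changed the name.';
var sentences = splitSentences(paragraph);
```

With the paragraph from the question, sentences should hold the four expected entries. This is only a sketch: abbreviations such as "Dr." or decimal numbers would still be split at the period, and trailing text without a terminator is dropped.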

Upvotes: 1
