Prashanth Benny

Reputation: 1609

Splitting a paragraph into sentences while keeping the punctuation - not a dup

Here is a point where I am stuck again using a regular expression with PHP's preg_split() function.

Here is the code:

preg_split('~("[^"]*")|[!?.।]+\s*|\R+~u', $paragraph, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

I am trying to split a paragraph into sentences, and this code does the job for me.
Here is a link to my previous question.

But now I need to keep the punctuation intact (the question marks, full stops, etc.).

Using PREG_SPLIT_DELIM_CAPTURE was supposed to do that job, but somehow it is not working that way. I get only the sentences, without the full stops or question marks.

Upvotes: 1

Views: 164

Answers (1)

revo

Reputation: 48711

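Your requirement doesn't need PREG_SPLIT_DELIM_CAPTURE for the punctuation. That flag is helpful when you need the delimiters returned as individual elements, and it only returns the parts of the delimiter matched by a capture group (here, the quoted strings). The punctuation class is not inside a capture group, so it is discarded. To keep the punctuation attached to its sentence you need \K, which resets the start of the reported match so that everything matched before it is no longer treated as part of the delimiter: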

<?php

var_dump(preg_split('~("[^"]*")|[!?.।]+\K\s*|\R+~u', <<<STR
hello! how are you? how is life
live life, live free. "isnt it?"
STR
, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY));

Output:

array(5) {
  [0]=>
  string(6) "hello!"
  [1]=>
  string(12) "how are you?"
  [2]=>
  string(11) "how is life"
  [3]=>
  string(21) "live life, live free."
  [4]=>
  string(10) ""isnt it?""
}
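
To make the difference between the two approaches concrete, here is a minimal sketch that compares them side by side. The sample text and the variable names ($separate, $attached) are made up for illustration, and the patterns are simplified versions of the one above (no quoted-string or line-break handling):

```php
<?php

// Hypothetical sample text, not taken from the question.
$paragraph = 'Wait! Is it done? Not yet';

// Variant A: the punctuation sits inside a capture group, so
// PREG_SPLIT_DELIM_CAPTURE returns it, but as separate array elements.
$separate = preg_split(
    '~([!?.।]+)\s*~u',
    $paragraph,
    -1,
    PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
);

// Variant B: \K drops the punctuation from the reported match, so only
// the trailing whitespace acts as the delimiter and the punctuation
// stays attached to the preceding sentence.
$attached = preg_split(
    '~[!?.।]+\K\s*~u',
    $paragraph,
    -1,
    PREG_SPLIT_NO_EMPTY
);

var_dump($separate, $attached);
```

Variant A should give "Wait", "!", "Is it done", "?", "Not yet", while variant B should give "Wait!", "Is it done?", "Not yet". If the punctuation needs to be attached to the sentence, variant B is the one that matches the requirement in the question.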

Upvotes: 3
