Julien

Reputation: 27

Split a sentence into parts based on punctuation

I have spent the last hour looking for an answer but haven't found one yet, so I'm asking here.

I need a way (probably a regex, but anything else, such as explode, is fine) to split a sentence like the following into parts, all in the same array:

This is the first part, this is the second part; this is the third part! this is the fourth part? again - and again - until the sentence is over.

I want an array with the following entries (without the punctuation marks and without any spaces before or after them, please):

EDIT: Sorry, the example is in English, but the solution should be able to handle a wide variety of scripts (all of Unicode, basically).

Thanks a lot!

Upvotes: 0

Views: 1458

Answers (3)

anubhava

Reputation: 785058

A single preg_split can do the job:

$s = 'This is the first part, this is the second part; this is the third part! this is the fourth part? again - and again - until the sentence is over.';
print_r(preg_split('/\s*[,:;!?.-]\s*/u', $s, -1, PREG_SPLIT_NO_EMPTY));

OUTPUT:

Array
(
    [0] => This is the first part
    [1] => this is the second part
    [2] => this is the third part
    [3] => this is the fourth part
    [4] => again
    [5] => and again
    [6] => until the sentence is over
)
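Since the question asks for support beyond English, here is a minimal sketch of the same idea using the Unicode property class \p{P} (any punctuation character) instead of a fixed list. The function name splitOnPunctuation is hypothetical, not part of the original answer, and the input is assumed to be valid UTF-8.

```php
<?php
// Hypothetical helper, shown only as an illustration.
// Assumes $text is valid UTF-8; preg_split() returns false otherwise.
function splitOnPunctuation(string $text): array
{
    // \p{P}+ matches one or more Unicode punctuation characters;
    // \s* on both sides swallows the surrounding whitespace.
    // The u modifier makes PCRE treat pattern and subject as UTF-8.
    $parts = preg_split('/\s*\p{P}+\s*/u', $text, -1, PREG_SPLIT_NO_EMPTY);

    // Fall back to an empty array if the regex engine reports an error.
    return $parts === false ? [] : $parts;
}

print_r(splitOnPunctuation('Это первая часть, это вторая часть; это третья часть!'));
```

Note that \p{P} also covers apostrophes and hyphens, so words like "don't" or "well-known" would be split too; if that matters, a narrower character class is probably the safer choice.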

Upvotes: 1

Nauphal

Reputation: 6192

I found a solution here

Here is an approach that explodes a string on multiple delimiters.

<?php

// $delimiters has to be a non-empty array of delimiter strings
// $string has to be a string

function multiexplode ($delimiters,$string) {

    $ready = str_replace($delimiters, $delimiters[0], $string);
    $launch = explode($delimiters[0], $ready);
    return  $launch;
}

$text = "here is a sample: this text, and this will be exploded. this also | this one too :)";
$exploded = multiexplode(array(",",".","|",":"),$text);

print_r($exploded);

//And output will be like this:
// Array
// (
//    [0] => here is a sample
//    [1] =>  this text
//    [2] =>  and this will be exploded
//    [3] =>  this also
//    [4] =>  this one too
//    [5] => )
// )

?>
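The output above still carries the spaces around each delimiter, which the question wants removed. A minimal sketch of one way to handle that, assuming a non-empty delimiter array, is to trim every piece and drop the empty ones. The name multiexplodeTrimmed is hypothetical and only used for illustration.

```php
<?php
// Hypothetical variant of multiexplode() that also trims each piece.
// Assumes $delimiters is a non-empty array of strings.
function multiexplodeTrimmed(array $delimiters, string $string): array
{
    // Normalise every delimiter to the first one, then split on it.
    $ready = str_replace($delimiters, $delimiters[0], $string);
    $pieces = array_map('trim', explode($delimiters[0], $ready));

    // Drop empty pieces and reindex the result from 0.
    return array_values(array_filter($pieces, 'strlen'));
}

$text = "here is a sample: this text, and this will be exploded. this also | this one too :)";
print_r(multiexplodeTrimmed(array(",", ".", "|", ":"), $text));
```

This keeps the str_replace/explode approach and only post-processes the result, so it stays byte-safe for UTF-8 input as long as the delimiters themselves are complete characters.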

Upvotes: 1

Mina

Reputation: 1516

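Try this (note that it splits on any run of characters other than ASCII letters and whitespace, so digits and non-ASCII letters are treated as delimiters too):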

$parts = preg_split("/[^A-Z\s]+/i", $string);
var_dump($parts);

Upvotes: 0
