JoeC

Reputation: 367

Retain Delimiters when Splitting String

Edit: OK, I can't read; thanks to Col. Shrapnel for the help. For anyone who comes here looking for the same answer: print_r(preg_split('/([\!|\?|\.|\!\?])/', $string, null, PREG_SPLIT_DELIM_CAPTURE));

Is there any way to split a string on a set of delimiters, and retain the position and character(s) of the delimiter after the split?

For example, using the delimiters !, ?, . and !?, I want to turn this:

$string = 'Hello. A question? How strange! Maybe even surreal!? Who knows.';

into this:

array('Hello', '.', 'A question', '?', 'How strange', '!', 'Maybe even surreal', '!?', 'Who knows', '.');

Currently I'm trying to use print_r(preg_split('/([\!|\?|\.|\!\?])/', $string)); to capture the delimiters as a subpattern, but I'm not having much luck.

Upvotes: 0

Views: 261

Answers (4)

mickmackusa

Reputation: 47864

As of PHP 8.1, passing null as the limit parameter of preg_split() is deprecated because an integer is expected. To get an unlimited number of elements in the returned array, use 0 or -1 instead.

To avoid empty elements in the returned array, I recommend adding PREG_SPLIT_NO_EMPTY as an additional flag.

var_export(
    preg_split(
        '/(!\?|[!?.])/',
        $string,
        0,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    )
);

Since PHP8, it is technically possible to omit the limit parameter and declare flags by using named parameters.
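
As a rough sketch of that named-parameter form (assuming PHP 8.0 or later, where the parameter is named flags, and reusing the sample $string from the question), the call might look like this:

```php
<?php
// Requires PHP 8.0+ for named arguments.
// $string is the sample sentence from the question.
$string = 'Hello. A question? How strange! Maybe even surreal!? Who knows.';

// The limit argument is skipped entirely, so it keeps its default of -1 (no limit).
$parts = preg_split(
    '/(!\?|[!?.])/',
    $string,
    flags: PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
);

var_export($parts);
```

The pieces still carry their leading spaces here; only the empty trailing element is dropped by PREG_SPLIT_NO_EMPTY.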

Upvotes: 1

Pindatjuh

Reputation: 10526

You can also split on the space after a ., !, ? or !?. But this can only be used if you can guarantee that there is a space after such a character.

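You can do this by matching a space, but only when it is preceded by one of the delimiters, using a positive lookbehind: (?<=\.|\?|!). Since !? ends in ?, it does not need its own alternative. This makes the regex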

'/(?<=\.|\?|!) /'

Then you'll have to check whether each resulting string ends with !?: if so, split off the last two characters. If not, split off the last character.
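
A minimal sketch of this approach, assuming every sentence ends with one of the delimiters and is followed by exactly one space (the $result and $length names are only illustrative):

```php
<?php
// $string is the sample sentence from the question.
$string = 'Hello. A question? How strange! Maybe even surreal!? Who knows.';

$result = [];

// Split on each space that directly follows a delimiter.
foreach (preg_split('/(?<=\.|\?|!) /', $string) as $sentence) {
    // The delimiter is two characters long for "!?" and one character otherwise.
    $length = substr($sentence, -2) === '!?' ? 2 : 1;

    $result[] = substr($sentence, 0, -$length); // the text
    $result[] = substr($sentence, -$length);    // the delimiter
}

print_r($result);
```

Because the split consumes the space, the text elements come out without leading whitespace.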

Upvotes: 0

Chad Birch

Reputation: 74528

Your comment sounds like you've found the relevant flag, but your regex was a little off, so I'm going to add this anyway:

preg_split('/(!\?|[!?.])/', $string, -1, PREG_SPLIT_DELIM_CAPTURE);

Note that this will leave a space at the beginning of every string after the first, so you'll probably want to run them all through trim() as well.

Results:

$string = 'Hello. A question? How strange! Maybe even surreal!? Who knows.';
print_r(preg_split('/(!\?|[!?.])/', $string, -1, PREG_SPLIT_DELIM_CAPTURE));

Array
(
    [0] => Hello
    [1] => .
    [2] =>  A question
    [3] => ?
    [4] =>  How strange
    [5] => !
    [6] =>  Maybe even surreal
    [7] => !?
    [8] =>  Who knows
    [9] => .
    [10] => 
)
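
The trim() step mentioned above could be sketched like this. Adding PREG_SPLIT_NO_EMPTY to drop the empty last element is my own assumption about what is wanted, not part of the original call:

```php
<?php
// $string is the sample sentence from the question.
$string = 'Hello. A question? How strange! Maybe even surreal!? Who knows.';

// Split while keeping the delimiters, skip empty pieces,
// then strip the leading space from each remaining piece.
$parts = array_map(
    'trim',
    preg_split(
        '/(!\?|[!?.])/',
        $string,
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    )
);

print_r($parts);
```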

Upvotes: 1

ircmaxell

Reputation: 165191

Simply pass the PREG_SPLIT_DELIM_CAPTURE flag to preg_split():

$str = 'Hello. A question? How strange!';
$var = preg_split('/([!?.])/', $str, 0, PREG_SPLIT_DELIM_CAPTURE);
// $var now contains:
// array(
//     0 => "Hello",
//     1 => ".",
//     2 => " A question",
//     3 => "?",
//     4 => " How strange",
//     5 => "!",
//     6 => "",
// );

Upvotes: 0
