Aleksa Arsić

Reputation: 534

Parse string with regex and get desired output

I want to parse this string:

[[delay-4]]Welcome! [[delay-2]]Do you have some questions for us?[[delay-1]] Please fill input field!

I need to get something like this:

[
    [0] => '[[delay-4]]Welcome!',
    [1] => '[[delay-2]]Do you have some questions for us?',
    [2] => '[[delay-1]] Please fill input field!'
];

The string can also look like this (without [[delay-4]] at the beginning):

Welcome! [[delay-2]]Do you have some questions for us?[[delay-1]] Please fill input field!

Expected output should be something like this:

    [
        [0] => 'Welcome!',
        [1] => '[[delay-2]]Do you have some questions for us?',
        [2] => '[[delay-1]] Please fill input field!'
    ];

I tried this regex (https://regex101.com/r/Eqztl1/1/):

(?:\[\[delay-\d+]])?([\w \\,?!.@#$%^&*()|`\]~\-='\"{}]+)

But I have a problem with that regex: if someone writes just one [ in the text, the regex fails, and if I add [ to the character class, I get wrong results.

Can anyone help me with this?

Upvotes: 1

Views: 162

Answers (3)

Dean Taylor

Reputation: 41981

Two simpler actions might be the route to get the result:

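// Move each [[delay-N]] marker onto its own line, dropping any whitespace before it.
$result = preg_replace('/\s*(\[\[delay-\d+]])/i', "\n$1", $subject);
// Split on the inserted newlines and discard empty pieces.
$result = preg_split('/\r?\n/i', $result, -1, PREG_SPLIT_NO_EMPTY);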

Can be seen running here: https://ideone.com/Z5tZI3 and here: https://ideone.com/vnSNYI

This assumes that newline characters don't have special meaning and are OK to split on.


UPDATE: As noted in the comments below, it's possible with a single split:

$result = preg_split('/(?=\[\[delay-\d+]])/i', $subject, -1, PREG_SPLIT_NO_EMPTY);

But zero-length matches can cause issues in regular expressions, so you would have to do your own research on that.
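As a minimal sketch of the single-split variant, the snippet below runs the lookahead split on both sample strings from the question. The helper name `splitOnDelay` and the trailing-whitespace trim are assumptions added for illustration; they are not part of the answer.

```php
<?php
// Sample inputs taken from the question.
$inputs = [
    '[[delay-4]]Welcome! [[delay-2]]Do you have some questions for us?[[delay-1]] Please fill input field!',
    'Welcome! [[delay-2]]Do you have some questions for us?[[delay-1]] Please fill input field!',
];

/**
 * Hypothetical helper: split the subject in front of every [[delay-N]] marker.
 */
function splitOnDelay(string $subject): array
{
    // The lookahead (?=...) matches the empty position before each marker,
    // so the marker stays at the start of the piece that follows it.
    $parts = preg_split('/(?=\[\[delay-\d+]])/', $subject, -1, PREG_SPLIT_NO_EMPTY) ?: [];

    // Assumption: the space left before a marker (e.g. "Welcome! ") is unwanted.
    return array_map('rtrim', $parts);
}

foreach ($inputs as $input) {
    print_r(splitOnDelay($input));
}
```

Because the split point is an empty position, nothing is removed from the text, so no marker has to be re-added afterwards.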

Upvotes: 2

The fourth bird

Reputation: 163217

In your pattern

(?:\[\[delay-\d+]])?([\w \\,?!.@#$%^&*()|`\]~\-='\"{}]+)

there is no opening [ in the character class. The problem is that if you add it, you get, as you say, wrong results.

That is because, after matching the delay part, the character class that follows now contains [ and can match the rest of the string, including the characters of the later delay parts.
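A small sketch that reproduces this behaviour, using the pattern from the question with [ added to the character class. The variable names are illustrative only.

```php
<?php
$subject = '[[delay-4]]Welcome! [[delay-2]]Do you have some questions for us?[[delay-1]] Please fill input field!';

// The question's pattern with "[" added to the character class (still greedy).
$greedy = <<<'REGEX'
/(?:\[\[delay-\d+]])?([\w \\,?!.@#$%^&*()|`[\]~\-='\"{}]+)/
REGEX;

preg_match_all($greedy, $subject, $matches);

// Every character of the subject, including "[", "]", "-" and the digits,
// is in the class, so the first match runs to the end of the string.
var_dump(count($matches[0]));          // int(1)
var_dump($matches[0][0] === $subject); // bool(true)
```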

What you could do is add [ and make the match non-greedy, combined with a positive lookahead that asserts either the next delay part or the end of the string, so that the last instance is matched as well.

If you are not using the capturing group and only want the result, you can omit it:

(?:\[\[delay-\d+]])?[\w \\,?!.@#$%^&*()|`[\]~\-='\"{}]+?(?=\[\[delay-\d+]]|$)
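A possible way to use this pattern from PHP is sketched below. The nowdoc string avoids escaping the quotes inside the character class; the helper name `matchSegments`, the `/` delimiters, and the trailing-whitespace trim are assumptions made for the example.

```php
<?php
// The pattern above, wrapped in "/" delimiters ("/" does not occur in the pattern).
$pattern = <<<'REGEX'
/(?:\[\[delay-\d+]])?[\w \\,?!.@#$%^&*()|`[\]~\-='\"{}]+?(?=\[\[delay-\d+]]|$)/
REGEX;

/**
 * Hypothetical helper: return every segment matched by the pattern.
 */
function matchSegments(string $pattern, string $subject): array
{
    preg_match_all($pattern, $subject, $matches);

    // Assumption: whitespace left in front of the next marker is unwanted.
    return array_map('rtrim', $matches[0]);
}

print_r(matchSegments($pattern, '[[delay-4]]Welcome! [[delay-2]]Do you have some questions for us?[[delay-1]] Please fill input field!'));

// A single "[" in the text no longer breaks the match.
print_r(matchSegments($pattern, 'Use [ here [[delay-2]]ok'));
```

The lazy quantifier stops at the first position where the lookahead succeeds, which is why a lone [ stays inside its segment while a full [[delay-N]] marker starts a new one.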


Upvotes: 1

Andreas

Reputation: 23958

You can do that without regex too.

Explode on [[ and loop over the array. If an item starts with "delay", prepend [[ to it.

$str = '[[delay-4]]Welcome! [[delay-2]]Do you have some questions for us?[[delay-1]] Please fill input field!';

$arr = array_filter(explode("[[", $str));

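foreach($arr as &$val){
    // Restore the brackets that explode() removed from delay markers.
    if(substr($val,0,5) == "delay") $val = "[[" . $val;
}
unset($val); // break the reference left over from the loop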

var_dump($arr);

https://3v4l.org/sIui1
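The explode() approach above keeps the original array keys after array_filter, so the first sample ends up with keys 1 to 3 rather than 0 to 2. A minimal sketch of a variant that returns 0-based keys follows; the function name `splitWithoutRegex` and the whitespace trim are assumptions, not part of the answer.

```php
<?php
/**
 * Hypothetical wrapper around the explode() idea.
 */
function splitWithoutRegex(string $str): array
{
    $result = [];
    foreach (explode('[[', $str) as $piece) {
        if ($piece === '') {
            continue; // empty piece produced when the string starts with "[["
        }
        // Restore the brackets that explode() consumed for delay markers.
        if (strncmp($piece, 'delay', 5) === 0) {
            $piece = '[[' . $piece;
        }
        // Assumption: whitespace left in front of the next marker is unwanted.
        $result[] = rtrim($piece);
    }
    return $result; // keys are 0-based because items are appended
}

print_r(splitWithoutRegex('[[delay-4]]Welcome! [[delay-2]]Do you have some questions for us?[[delay-1]] Please fill input field!'));
print_r(splitWithoutRegex('Welcome! [[delay-2]]Do you have some questions for us?[[delay-1]] Please fill input field!'));
```

Note that, like the original, this still splits on any [[ in the text, including one that is not followed by "delay".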

Upvotes: 1
