konrados

Reputation: 1141

PHP - how to explode a string on commas, except when a comma is inside apostrophes?

I have the following text:

$string='
            blah<br>
            @include (\'file_to_load\')
            <br>
            @include (\'file_to_load\',\'param1\',\'param2\',\'param3\')
    ';

I'd like to catch (and then replace using preg_replace_callback) all the occurrences of "@include" together with their parameters, e.g. @include ('file_to_load','param1','param2','param3').

So I do this:

$string='
 blah<br>
 @include (\'file_to_load\')
 <br>
 @include (\'file_to_load\',\'param1\',\'param2\')
';
$params=[];
$result = preg_replace_callback(
    '~@include \((,?.*?)\)~',//I catch @include, parenthesis and all between them
    function ($matches) {
        echo '---iteration---';
        $params=explode(',',$matches[1]);//exploding by a comma
        echo '<pre>';
        var_dump($params);
        echo '</pre>';
        return $matches[1];
    },
    $string
);

And everything's fine until a comma appears inside a parameter, like here:

$string='
    blah<br>
    @include (\'file_to_load\')
    <br>
    @include (\'file_to_load\',\'param1,something\',[\'elem\'=>\'also, a comma\'])
';

Here there is a comma inside the second parameter ('param1,something') and another inside the array, so splitting with explode() no longer gives the result I want.

Is there a way to split the string on commas (probably with a regular expression), but not when the comma is inside apostrophes?

Upvotes: 3

Views: 563

Answers (3)

Lucas Trzesniewski

Reputation: 51390

What you're looking for is tokenization. Don't try to split on the commas. Instead, identify each building block of your expression. So you need matching, not splitting.

For instance, this simple regex:

'[^']+'

Will match these elements:

@include ('file_to_load','param1,something',['elem'=>'also, a comma'])
          \____________/ \________________/  \____/  \_____________/

But it may not be sufficient for your case, since you have an array in there, and I assume you have to parse it as well.

So identify each parameter separately:

'[^']+'|\[.+?\]

This matches:

@include ('file_to_load','param1,something',['elem'=>'also, a comma'])
          \____________/ \________________/ \_______________________/
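
As a rough sketch, the flat pattern above could be used from the question's callback like this. The outer regex, the `$calls` collector and the empty replacement are assumptions made for illustration, not part of the answer; the sample input is the question's text with balanced brackets, and one @include per line is assumed.

```php
<?php
// Sample input modelled on the question (hypothetical, brackets balanced).
$string = <<<'TXT'
blah<br>
@include ('file_to_load')
<br>
@include ('file_to_load','param1,something',['elem'=>'also, a comma'])
TXT;

$calls = []; // collects the parameter tokens of every @include found

$result = preg_replace_callback(
    '~@include \((.*)\)~', // greedy: assumes at most one @include per line
    function (array $m) use (&$calls) {
        // Match the tokens (quoted strings or flat arrays) instead of splitting on commas.
        preg_match_all("~'[^']+'|\\[.+?\\]~", $m[1], $tokens);
        $calls[] = $tokens[0];
        return ''; // placeholder; a real callback would return the replacement text
    },
    $string
);

print_r($calls);
```

Each entry of `$calls` then holds the tokens of one directive, with the commas inside quotes left untouched.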

The issue with this approach is that it won't allow you to match nested arrays. If you need to be able to parse that, then the pattern gets more complicated:

(?(DEFINE)
  (?<string>'[^']+')
  (?<array> \[ (?: (?&arrayitem) (?> , \s* (?&arrayitem) )* )? \] )
  (?<arrayitem> \s* (?&string) \s* => \s* (?&value) \s* )
  (?<value> (?&string) | (?&array) )
)
(?&value)

Yeah, that's a recursive regex but it can actually identify the parameters:

@include ('file_to_load','param1,something',['elem'=>'also, a comma','other'=>['nested' => 'array']])
          \____________/ \________________/ \______________________________________________________/
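
A minimal sketch of running the recursive pattern from PHP. Because the pattern is laid out with whitespace and line breaks, this assumes the `x` (extended) modifier; the `~` delimiters, the nowdoc wrapper and the `$args` sample are illustrative choices, not something the answer specifies.

```php
<?php
// The pattern from above, wrapped in ~ delimiters with the x modifier
// so that the whitespace inside it is ignored.
$pattern = <<<'REGEX'
~
(?(DEFINE)
  (?<string> '[^']+' )
  (?<array> \[ (?: (?&arrayitem) (?> , \s* (?&arrayitem) )* )? \] )
  (?<arrayitem> \s* (?&string) \s* => \s* (?&value) \s* )
  (?<value> (?&string) | (?&array) )
)
(?&value)
~x
REGEX;

// Hypothetical argument list: the text between the parentheses of one @include.
$args = "'file_to_load','param1,something',['elem'=>'also, a comma','other'=>['nested' => 'array']]";

preg_match_all($pattern, $args, $m);

// $m[0] holds one entry per top-level parameter.
print_r($m[0]);
```

The named groups inside DEFINE also show up as (empty) keys in `$m`; only `$m[0]` is of interest here.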


Since I don't know what you'll do with the parameters once you've split them, I can't say whether regular expressions are enough; you may actually need to write a parser instead.

Side note: You may need to replace the '[^']+' string pattern with something a bit more complicated if you want to be able to escape a quote inside the string.

There are two widely-accepted ways to do this:

  • Use a backslash: 'abc\'def'

    '(?:[^\\']++|\\.)*'
    
  • Double the quote: 'abc''def'

    '(?:[^']++|'')*'
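
To illustrate the two escaping styles, here is a small sketch that applies each string pattern with preg_match_all. The variable names and sample inputs are made up for the example; nowdoc strings are used only to avoid a second layer of PHP escaping.

```php
<?php
// Backslash-escaped quotes, e.g. 'abc\'def'
$backslashPattern = <<<'REGEX'
~'(?:[^\\']++|\\.)*'~
REGEX;

// Doubled quotes, e.g. 'abc''def'
$doubledPattern = <<<'REGEX'
~'(?:[^']++|'')*'~
REGEX;

// Hypothetical inputs: two quoted strings each, the first containing an escaped quote.
$backslashInput = <<<'TXT'
'abc\'def','x,y'
TXT;
$doubledInput = <<<'TXT'
'abc''def','x,y'
TXT;

preg_match_all($backslashPattern, $backslashInput, $m1);
preg_match_all($doubledPattern, $doubledInput, $m2);

print_r($m1[0]); // the two strings, escaped quote kept inside the first
print_r($m2[0]); // same, for the doubled-quote style
```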
    

Upvotes: 2

karthik manchala

Reputation: 13640

Use the following to split:

,(?=([^']*'[^']*')*[^']*$)

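The lookahead only allows a split where the comma is followed by an even number of apostrophes, i.e. where it sits outside a quoted string. Use preg_split, since explode does not support regular expressions. The pattern needs delimiters and, because it contains apostrophes, is easiest to write in a double-quoted string:

Code:

$params = preg_split("~,(?=([^']*'[^']*')*[^']*$)~", $matches[1]);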
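
A minimal, self-contained sketch of that split. The `~` delimiters, the non-capturing group and the `$args` sample are assumptions added here for the example.

```php
<?php
// Hypothetical argument list: the text between the parentheses of one @include.
$args = "'file_to_load','param1,something',['elem'=>'also, a comma']";

// Split on a comma only when an even number of apostrophes follows it,
// i.e. when the comma is outside any quoted string.
$pattern = "~,(?=(?:[^']*'[^']*')*[^']*$)~";

$params = preg_split($pattern, $args);
print_r($params);

// A comma between two array elements is also outside the quotes,
// so an array with several elements is split as well.
$arrayParts = preg_split($pattern, "['a'=>'b','c'=>'d']");
print_r($arrayParts);
```

The second call shows the trade-off of this approach: it only knows about apostrophes, so an array parameter with more than one element comes back in pieces.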

Upvotes: 2

Christian Ezeani

Reputation: 350

Try using this:

"\@include[\s]*\([^\)]*\)"

This will match

@include (\'file_to_load\')

and

@include (\'file_to_load\',\'param1,something\',[\'elem\'=>\'also, a comma\'])
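
As a quick sketch, the pattern can be checked with preg_match_all. The `~` delimiters and the sample text are assumptions for the example.

```php
<?php
// Sample input modelled on the question (hypothetical, brackets balanced).
$string = <<<'TXT'
blah<br>
@include ('file_to_load')
<br>
@include ('file_to_load','param1,something',['elem'=>'also, a comma'])
TXT;

// [^\)]* stops at the first closing parenthesis after "@include (".
preg_match_all('~\@include[\s]*\([^\)]*\)~', $string, $m);

print_r($m[0]);
```

This finds each whole directive; the parameter list inside the parentheses still has to be split or tokenized separately, and a ")" inside a quoted parameter would end the match early.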

I hope this helps.

Upvotes: 0
