Reputation: 1141
I have the following text:
$string='
blah<br>
@include (\'file_to_load\')
<br>
@include (\'file_to_load\',\'param1\',\'param2\',\'param3\')
';
I'd like to catch (and then replace using preg_replace_callback) all occurrences of "@include" together with their parameters, e.g. @include ('file_to_load','param1','param2','param3').
So I do this:
$string='
blah<br>
@include (\'file_to_load\')
<br>
@include (\'file_to_load\',\'param1\',\'param2\')
';
$params=[];
$result = preg_replace_callback(
'~@include \((,?.*?)\)~',//I catch @include, parenthesis and all between them
function ($matches) {
echo '---iteration---';
$params=explode(',',$matches[1]);//exploding by a comma
echo '<pre>';
var_dump($params);
echo '</pre>';
return $matches[1];
},
$string
);
And everything's fine until a comma appears inside a parameter, like here:
$string='
blah<br>
@include (\'file_to_load\')
<br>
@include (\'file_to_load\',\'param1,something\',[\'elem\'=>\'also, a comma\'])
';
Here there is a comma inside the "param1" parameter (and another one inside the array), so splitting with explode() no longer gives the result I want.
Is there a way to split the string on commas (probably with a regular expression), but not when the comma is inside apostrophes?
Upvotes: 3
Views: 563
Reputation: 51390
What you're looking for is tokenization. Don't try to split on the commas. Instead, identify each building block of your expression. So you need matching, not splitting.
For instance, this simple regex:
'[^']+'
Will match these elements:
@include ('file_to_load','param1,something',['elem'=>'also, a comma'])
          \____________/ \________________/  \____/  \_____________/
But it may not be sufficient for your case, since you have an array in there, and I assume you have to parse it as well.
So identify each parameter separately:
'[^']+'|\[.+?\]
@include ('file_to_load','param1,something',['elem'=>'also, a comma'])
          \____________/ \________________/ \_______________________/
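In PHP, matching instead of splitting could look like the sketch below. It assumes the two-alternative pattern above wrapped in ~ delimiters, and applies it with preg_match_all to the text captured between the parentheses; the variable names are only illustrative.

```php
<?php
// Argument list as captured by the outer @include regex (illustrative input).
$args = "'file_to_load','param1,something',['elem'=>'also, a comma']";

// Match each quoted string or flat [...] array instead of splitting on commas.
// "\\[" in a double-quoted PHP string reaches the regex engine as "\[".
preg_match_all("~'[^']+'|\\[.+?\\]~", $args, $m);

// $m[0] holds the full matches, one per parameter.
$params = $m[0];

print_r($params);
```

Each element of $params is still raw text (quotes and brackets included), so it has to be unquoted or parsed further before use.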
The issue with this approach is that it won't allow you to match nested arrays. If you need to be able to parse that, then the pattern gets more complicated:
(?(DEFINE)
(?<string>'[^']+')
(?<array> \[ (?: (?&arrayitem) (?> , \s* (?&arrayitem) )* )? \] )
(?<arrayitem> \s* (?&string) \s* => \s* (?&value) \s* )
(?<value> (?&string) | (?&array) )
)
(?&value)
Yeah, that's a recursive regex (and, since it is written with insignificant whitespace, it needs the x flag), but it can actually identify the parameters:
@include ('file_to_load','param1,something',['elem'=>'also, a comma','other'=>['nested' => 'array']])
          \____________/ \________________/ \______________________________________________________/
As I don't know what you're trying to do with the parameters afterwards, you may actually need to write a parser instead of using regular expressions, but that depends on what you'll be trying to do once you split the parameters.
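As a rough sketch of how the recursive pattern might be used from PHP: the pattern is placed in a nowdoc so that no quote or backslash needs escaping, and the x modifier is added because the pattern contains insignificant whitespace. The ~ delimiters, the variable names and the sample input are assumptions made for the example.

```php
<?php
// Recursive pattern from above, wrapped in ~ delimiters with the x flag.
$pattern = <<<'REGEX'
~(?(DEFINE)
  (?<string>'[^']+')
  (?<array> \[ (?: (?&arrayitem) (?> , \s* (?&arrayitem) )* )? \] )
  (?<arrayitem> \s* (?&string) \s* => \s* (?&value) \s* )
  (?<value> (?&string) | (?&array) )
)
(?&value)~x
REGEX;

// Illustrative argument list containing a nested array.
$args = "'file_to_load','param1,something',"
      . "['elem'=>'also, a comma','other'=>['nested' => 'array']]";

preg_match_all($pattern, $args, $m);

// $m[0] holds one full match per top-level parameter.
$params = $m[0];

print_r($params);
```

The nested array comes back as a single token, so its contents still need a second pass (or a real parser) to become a PHP array.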
Side note: You may need to replace the '[^']+' string pattern with something a bit more complicated if you want to be able to escape a quote inside the string.
There are two widely-accepted ways to do this:
Use a backslash: 'abc\'def'
'(?:[^\\']++|\\.)*'
Double the quote: 'abc''def'
'(?:[^']++|'')*'
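To illustrate both escaping styles, here is a small sketch that runs each pattern against a matching sample. The delimiters, variable names and inputs are made up for the example; nowdocs are used where backslashes would otherwise need doubling in PHP source.

```php
<?php
// Backslash-escaped quotes: 'abc\'def'
$backslash = <<<'REGEX'
~'(?:[^\\']++|\\.)*'~
REGEX;

// Doubled quotes: 'abc''def'
$doubled = "~'(?:[^']++|'')*'~";

// Sample inputs, each holding two quoted strings separated by a comma.
$inputA = <<<'TXT'
'abc\'def','x'
TXT;
$inputB = "'abc''def','x'";

preg_match_all($backslash, $inputA, $a);
preg_match_all($doubled, $inputB, $b);

print_r($a[0]); // strings found with backslash escaping
print_r($b[0]); // strings found with doubled-quote escaping
```

Unlike '[^']+', both of these patterns also accept the empty string ''.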
Upvotes: 2
Reputation: 13640
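Use the following pattern to split on commas that are outside quotes:
,(?=(?:[^']*'[^']*')*[^']*$)
Use preg_split, since explode does not support regex. The pattern needs delimiters, and because it contains apostrophes the PHP string should be double-quoted (or have its apostrophes escaped):
Code:
$params = preg_split("~,(?=(?:[^']*'[^']*')*[^']*\$)~", $matches[1]);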
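A short sketch of what this split does on a sample argument list; the ~ delimiters and the input are assumptions made for the example. It also shows a limitation: the lookahead only tracks apostrophes, so a comma between two array elements still counts as a separator.

```php
<?php
// Split on commas that are followed by an even number of apostrophes,
// i.e. commas that are not inside a quoted string.
$pattern = "~,(?=(?:[^']*'[^']*')*[^']*\$)~";

// Illustrative input: one quoted string and one array with two elements.
$args = "'a,b',['k'=>'v','k2'=>'v2']";

$params = preg_split($pattern, $args);

// The comma inside 'a,b' is kept, but the array is cut in two.
print_r($params);
```

If arrays with several elements can appear as parameters, matching whole tokens is likely a better fit than splitting.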
Upvotes: 2
Reputation: 350
Try using this:
"\@include[\s]*\([^\)]*\)"
This will match
@include (\'file_to_load\')
and
@include (\'file_to_load\',\'param1,something\',[\'elem\'=>\'also, a comma\'])
I hope this helps.
Upvotes: 0