Acsrel

Reputation: 35

Split mathematical expression into array without splitting subexpressions between parentheses and single quotes

Let's say I have this string:

1 + 2 * (3 + (23 + 53 - (132 / 5) + 5) - 1) + 2 / 'test + string' - 52

I want to split it into an array of operators and non-operators, but anything inside parentheses or single quotes must not be split.

I want the output to be:

[1, "+", 2, "*", "(3 + (23 + 53 - (132 / 5) + 5) - 1)", "+", 2, "/", "'test + string'", "-", 52]

I'm using this code:

preg_split("~['\(][^'()]*['\)](*SKIP)(*F)|([+\-*/^])+~", $str, -1, PREG_SPLIT_DELIM_CAPTURE);

This does what I want for the operators and the single-quoted string, but not for the parentheses: it only keeps (132 / 5) (the most deeply nested parenthetical expression) intact and splits all the others, giving me this output:

[1, "+", 2, "*", "(3", "+", "(23", "+", 53, "-", "(132 / 5)", "+", "5)", "-", "1)", "+", 2, "/", "'test + string'", "-", 52]

How can I ensure that the outermost parenthetical expression and all of its contents remain together?

Upvotes: 2

Views: 95

Answers (2)

mickmackusa

Reputation: 47991

I do like @thefourthbird's recursive subpattern, but I would prefer to standardize the output elements so that all whitespace is removed.

I won't use delimiter capturing or skip-fail; instead I'll use \K to restart the full-string match, so that only the optional space after each token is consumed as the delimiter.

Code: (Demo)

$result = preg_split(
    "~(?:(\((?:[^()]+|(?1))*\))|'[^']*'|[\d.]+|[*/^+-])\K ?~",
    $str,
    -1,
    PREG_SPLIT_NO_EMPTY
);

I have used similar techniques on Stack Overflow, like this one. Another consideration is: how do you want to handle signed numbers? Should the numeric entity retain the sign symbol, or should the sign be separated as if it were an operator?
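If the sign should stay attached to the number, one possible approach is to tokenize first and then fold a leading + or - into the number that follows it. This is only a sketch: the function name tokenizeWithSignedNumbers is hypothetical, it uses preg_match_all rather than the preg_split call above, and it assumes a + or - counts as a sign only at the start of the expression or directly after another operator.

```php
<?php
// Hypothetical helper (not from the original answer): collect the tokens,
// then merge a leading sign into the number that follows it.
function tokenizeWithSignedNumbers(string $str): array
{
    // Same token alternatives as above: balanced parentheses, quoted
    // strings, numbers, and single operator characters.
    preg_match_all("~(\((?:[^()]++|(?1))*\))|'[^']*'|[\d.]+|[*/^+-]~", $str, $m);

    $operators = ['+', '-', '*', '/', '^'];
    $out = [];
    $pending = null; // a sign waiting to see whether a number follows

    foreach ($m[0] as $token) {
        if ($pending !== null) {
            if (is_numeric($token)) {
                // The sign belongs to this number.
                $out[] = $pending . $token;
                $pending = null;
                continue;
            }
            // Not followed by a number: emit the sign as its own token.
            $out[] = $pending;
            $pending = null;
        }

        $prev = end($out); // false while $out is still empty
        $signPosition = $prev === false || in_array($prev, $operators, true);

        if (($token === '+' || $token === '-') && $signPosition) {
            $pending = $token;
            continue;
        }
        $out[] = $token;
    }

    if ($pending !== null) {
        $out[] = $pending;
    }
    return $out;
}

print_r(tokenizeWithSignedNumbers("3 * -2 - (4 + 1)"));
// Elements: "3", "*", "-2", "-", "(4 + 1)"
```

Doing the merge in a second pass keeps the regex unchanged; whether a sign before a parenthesized group should be handled differently is left open here.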

Upvotes: 2

The fourth bird

Reputation: 163477

You might use a pattern that recurses the first subpattern to match balanced parentheses, and then apply (*SKIP)(*F) to that part so it is never used as a split point. After the alternation you can still use a capture group for the operators; it will be group 2, and its values are kept thanks to the PREG_SPLIT_DELIM_CAPTURE flag.

To remove the empty entries, you can add the PREG_SPLIT_NO_EMPTY flag.

(?:(\((?:[^()]++|(?1))*\))|'[^']*')(*SKIP)(*F)|([+\-*/^])

Regex demo
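To see the recursive part in isolation, here is a small sketch that only uses group 1 of the pattern. The sample string and the variable names are made up for illustration.

```php
<?php
// Group 1 matches "(", then any mix of non-parenthesis runs or a nested
// copy of group 1 via (?1), then the closing ")".
$pattern = "~(\((?:[^()]++|(?1))*\))~";

preg_match_all($pattern, "2 * (3 + (4 - 1)) + (5)", $m);
$groups = $m[0];

print_r($groups);
// Elements: "(3 + (4 - 1))" and "(5)"
```

Because the nested "(4 - 1)" is consumed by the recursion, only the outermost groups are reported as matches.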

$str = "1 + 2 * (3 + (23 + 53 - (132 / 5) + 5) - 1) + 2 / 'test + string' - 52";
$result = preg_split("~(?:(\((?:[^()]++|(?1))*\))|'[^']*')(*SKIP)(*F)|([+\-*/^])~", $str, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

print_r($result);

Output

Array
(
    [0] => 1 
    [1] => +
    [2] =>  2 
    [3] => *
    [4] =>  (3 + (23 + 53 - (132 / 5) + 5) - 1) 
    [5] => +
    [6] =>  2 
    [7] => /
    [8] =>  'test + string' 
    [9] => -
    [10] =>  52
)
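Note that the elements keep the spaces that surrounded them in the input. If that is unwanted, one option (a sketch, not part of the pattern above) is to trim every element afterwards with array_map:

```php
<?php
$str = "1 + 2 * (3 + (23 + 53 - (132 / 5) + 5) - 1) + 2 / 'test + string' - 52";

$parts = preg_split(
    "~(?:(\((?:[^()]++|(?1))*\))|'[^']*')(*SKIP)(*F)|([+\-*/^])~",
    $str,
    -1,
    PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
);

// Strip leading and trailing whitespace from each element;
// spaces inside the parentheses and quotes are left alone.
$trimmed = array_map('trim', $parts);

print_r($trimmed);
```

All elements are still strings after this step, including the numbers.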

Upvotes: 3
