SPlatten
SPlatten

Reputation: 5762

How to split expression containing brackets correctly

I am trying to write an expression handler that will correctly split brackets, until today it has worked very well, but I've now encountered a problem I hadn't thought of.

I try to split the expression by the content of brackets first, once these are evaluated I replace the original content with the results and process until there are no brackets remaining.

The expression may contain marcos/variables. Macros are denoted by text wrapped in $macro$.

A typical expression:

    ($exampleA$ * 3) + ($exampleB$ / 2)

Macros are replaced before the expression is evaluated, the above works fine because the process is as follows:

  1. Split expression by brackets, this results in two expressions:

    $exampleA$ * 3
    $exampleB$ / 2
    
  2. Each expression is then evaluated, if exampleA = 3 and exampleB = 6:

    $exampleA$ * 3 = 3 * 3 = 9
    $exampleB$ / 2 = 6 / 2 = 3
    
  3. The expression is then rebuilt using the results:

    9 + 3
    
  4. The final expression without any brackets is then evaluated to:

    12
    

This works fine until an expressions with nested brackets is used:

    ((($exampleA$ * 3) + ($exampleB$ / 2) * 2) - 1)

This breaks completely because the regular expression I'm using:

    regex("(?<=\\()[^)]*(?=\\))");

Results in:

    ($exampleA$ * 3
    $exampleB$ / 2

So how can I correctly decode this, I want the above to be broken down to:

    $exampleA$ * 3
    $exampleB$ / 2

Upvotes: 1

Views: 139

Answers (4)

BG100
BG100

Reputation: 4531

You can't really do this with regex. You really need a recursive method, like this:

using System;
using System.Data;
using System.Xml;

public class Program
{
    public static void Main() {         

        Console.WriteLine(EvaluateExpression("(1 + 2) * 7"));           

    }


    public static int EvaluateExpression(string expression) {

        // Recursively evaluate parentheses as sub expressions
        var expr = expression.ToLower();
        while (expr.Contains("(")) {

            // Find first opening bracket
            var count = 1;
            var pStart = expr.IndexOf("(", StringComparison.InvariantCultureIgnoreCase);
            var pos = pStart + 1;

            // Find matching closing bracket
            while (pos < expr.Length && count > 0) {
                if (expr.Substring(pos, 1) == "(") count++;
                if (expr.Substring(pos, 1) == ")") count--;
                pos++;
            }

            // Error if no matching closing bracket
            if (count > 0) throw new InvalidOperationException("Closing parentheses not found.");

            // Divide expression into sub expression
            var pre = expr.Substring(0, pStart);
            var subexpr = expr.Substring(pStart + 1, pos - pStart - 2);
            var post = expr.Substring(pos, expr.Length - pos);

            // Recursively evaluate the sub expression
            expr = string.Format("{0} {1} {2}", pre, EvaluateExpression(subexpr), post);

        }

        // Replace this line with you're own logic to evaluate 'expr', a sub expression with any brackets removed.
        return (int)new DataTable().Compute(expr, null);

    }

}

I'm assuming your using C# here... but you should get the idea and be able to translate it into whatever.

Upvotes: 1

Karel Vlk
Karel Vlk

Reputation: 239

I am not exactly sure what you are trying to do. If you want to match the innermost expressions, wouldn't this help?:

regex("(?<=\\()[^()]*(?=\\))");

By the way, are the parentheses in your example unbalanced on purpose?

Upvotes: 1

chaitan64arun
chaitan64arun

Reputation: 783

If you use the following regex, you can capture them as group(1). group(0) will have parenthesis included.

"\\(((?:\"\\(|\\)\"|[^()])+)\\)"

Hope it helps!

Upvotes: 0

0x5453
0x5453

Reputation: 13599

Traditional regex cannot handle recursive structures like nested brackets.

Depending on which regex flavor you are using, you may be able to use regex recursion. Otherwise, you will probably need a new method for parsing the groups. I think the traditional way is to represent the expression as a stack: start with an empty stack, push when you find a '(', pop when you find a ')'.

Upvotes: 1

Related Questions