
javascript regex

fluminis

Reputation: 4079

Javascript regex find variables in a math equation

I want to find in a math expression elements that are not wrapped between { and }

Examples:

Input: abc+1*def
Matches: ["abc", "1", "def"]
Input: {abc}+1+def
Matches: ["1", "def"]
Input: abc+(1+def)
Matches: ["abc", "1", "def"]
Input: abc+(1+{def})
Matches: ["abc", "1"]
Input: abc def+(1.1+{ghi})
Matches: ["abc def", "1.1"]
Input: 1.1-{abc def}
Matches: ["1.1"]

Rules

The expression is well-formed, so there won't be an opening parenthesis without a closing one, or a { without a }.
The math symbols allowed in the expression are + - / * and ( ).
Numbers can be decimals.
Variables can contain spaces.
There is only one level of { } (no nested braces).

So far, I have ended up with: http://regex101.com/r/gU0dO4

(^[^/*+({})-]+|(?:[/*+({})-])[^/*+({})-]+(?:[/*+({})-])|[^/*+({})-]+$)

I split the task into 3:

match elements at the beginning of the string
match elements that are between two { and }
match elements at the end of the string

But it doesn't work as expected.

Any ideas?

Upvotes: 3

Answers (4)

M'vy

Reputation: 5774

The variable names you mentioned can be matched by \b[\w.]+\b, since they are strictly bounded by word separators.

Since your formulas are well-formed, the names you don't want to capture are always immediately followed by }, so you can use a negative lookahead to exclude them:

(\b[\w.]+\b)(?!})

Will match the required elements (http://regexr.com/38rch).
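
As a rough illustration of the lookahead idea, the sketch below applies the pattern with the global flag in JavaScript. The helper name freeNames is made up for this example, and the } is escaped only for readability. This first version stops at spaces, so it only covers the examples whose names contain none.

```javascript
// Hypothetical helper (not from the answer): collect names not followed by "}".
function freeNames(expr) {
  // \b[\w.]+\b  -> a run of word characters and dots
  // (?!\})      -> rejected when the run is immediately followed by "}"
  return expr.match(/\b[\w.]+\b(?!\})/g) || [];
}

console.log(freeNames("abc+1*def"));     // [ 'abc', '1', 'def' ]
console.log(freeNames("{abc}+1+def"));   // [ '1', 'def' ]
console.log(freeNames("abc+(1+{def})")); // [ 'abc', '1' ]
```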

Edit:

For more complex inputs, such as correctly handling:

abc {def{}}
abc def+(1.1+{g{h}i})

we need to change the lookahead term to (?!({|}))

To handle 1.2-{abc def} as well, we need to replace the \b¹. The natural replacement would rely on lookbehind, which JavaScript regexes did not support at the time of writing, so we have to work around it:

(?:^|[^a-zA-Z0-9. ])([a-zA-Z0-9. ]+(?=[^0-9A-Za-z. ]))(?!({|}))

Seems to be a good one for our examples (http://regex101.com/r/oH7dO1).

¹ \b is the boundary between a \w and either a \W or the start or end of the string. Since \w does not include the space and \W does, \b is incompatible with our variable names, which may contain spaces.
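
To try the final pattern in JavaScript, the wanted text has to be read from the first capture group, because the delimiter before each name is part of the overall match. A minimal sketch, assuming a made-up helper name capturedNames and a standard exec loop (the braces are escaped only for readability):

```javascript
// Hypothetical helper (not from the answer): gather capture group 1 of each match.
function capturedNames(expr) {
  var re = /(?:^|[^a-zA-Z0-9. ])([a-zA-Z0-9. ]+(?=[^0-9A-Za-z. ]))(?!(\{|\}))/g;
  var names = [], m;
  while ((m = re.exec(expr)) !== null) {
    names.push(m[1]); // m[0] may also contain the delimiter before the name
  }
  return names;
}

console.log(capturedNames("abc def+(1.1+{ghi})")); // [ 'abc def', '1.1' ]
console.log(capturedNames("1.1-{abc def}"));       // [ '1.1' ]
```

Because the pattern requires a non-name character after each name, a name that ends the input (such as def in abc+1*def) is not captured by this sketch.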

Upvotes: 1

gog

Reputation: 11347

This might be an interesting regexp challenge, but in the real world you'd be much better off simply finding all [^+/*()-]+ groups and removing those enclosed in {}:

"abc def+(1.1+{ghi})".match(/[^+/*()-]+/g).filter(
    function(x) { return !/^{.+?}$/.test(x) })
// ["abc def", "1.1"]

That being said, regexes are not the right tool for parsing math expressions. For serious parsing, consider using formal grammars and parsers. There are plenty of parser generators for JavaScript; for example, in PEG.js you can write a grammar like

expr
  = left:multiplicative "+" expr
  / multiplicative

multiplicative
  = left:primary "*" right:multiplicative
  / primary

primary
  = atom
  / "{" expr "}"
  / "(" expr ")"

atom = number / word

number = n:[0-9.]+ { return parseFloat(n.join("")) }
word = w:[a-zA-Z ]+ { return w.join("") }

and generate a parser which will be able to turn

 abc def+(1.1+{ghi})

into

[
   "abc def",
   "+",
   [
      "(",
      [
         1.1,
         "+",
         [
            "{",
            "ghi",
            "}"
         ]
      ],
      ")"
   ]
]

Then you can iterate over this array as usual and pick out the parts you're interested in.
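
Walking that result could look like the sketch below. It assumes the parser returns exactly the nested-array shape shown above, with ( and { groups as three-element arrays; the function name collectFree and the OPERATORS list are illustrative and not part of PEG.js.

```javascript
// Assumed shape: nested arrays where a braced group is ["{", inner, "}"].
var OPERATORS = ["+", "-", "*", "/", "(", ")"];

function collectFree(node, out) {
  out = out || [];
  if (Array.isArray(node)) {
    if (node[0] === "{") return out; // skip everything inside { }
    node.forEach(function (child) { collectFree(child, out); });
  } else if (OPERATORS.indexOf(node) === -1) {
    out.push(node); // a number or a variable name
  }
  return out;
}

var tree = ["abc def", "+", ["(", [1.1, "+", ["{", "ghi", "}"]], ")"]];
console.log(collectFree(tree)); // [ 'abc def', 1.1 ]
```

Note that the number comes back as a parsed float rather than a string, because the grammar's number rule calls parseFloat.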

Upvotes: 2

Benjamin Gruenbaum

Reputation: 276406

Matching {}s, especially nested ones, is hard (read: impossible) for a standard regular expression, since it requires counting the {s encountered so far to know which } closes which.

Instead, simple string manipulation can work. This is a very basic parser that reads the string left to right and collects tokens only while it is outside the braces:

var input = "abc def+(1.1+{ghi})"; // I assume well formed, as well as no precedence
var inParens = false;
var output = [], buffer = "", parenCount = 0;
for(var i = 0; i < input.length; i++){
    if(!inParens){
          if(input[i] === "{"){
              inParens = true;
              parenCount++;
          } else if (["+","-","(",")","/","*"].some(function(x){ 
               return x === input[i]; 
          })){ // got symbol
              if(buffer!==""){ // buffer holds a finished token
                  output.push(buffer); // add it to the output
                  buffer = "";
              }
          } else { // letter or number
              buffer += input[i]; // push to buffer
          }
    } else { // inParens is true
         if(input[i] === "{") parenCount++;
         if(input[i] === "}") parenCount--;
         if(parenCount === 0) inParens = false; // consume again
    }
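}
if(buffer !== "") output.push(buffer); // flush a token that ends the input, e.g. "def" in "abc+1*def"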

Upvotes: 3

Amit Joki

Reputation: 59262

Building on user2864740's comment, you can replace everything between { and } with an empty string and then match what remains.

var matches = "string here".replace(/{.+?}/g,"").match(/\b[\w. ]+\b/g);

Since you know that the expressions are valid, you can simply select the runs of \w characters (plus . and space, to allow decimals and names with spaces).
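
To check this approach against the six inputs from the question, it can be wrapped in a small function. This is only a sketch: the name freeTerms is made up here, the braces are escaped for readability, and the || [] guard is an addition for inputs with nothing left to match, where match returns null.

```javascript
// Hypothetical wrapper around the replace-then-match idea above.
function freeTerms(expr) {
  var stripped = expr.replace(/\{.+?\}/g, ""); // drop each {...} group
  return stripped.match(/\b[\w. ]+\b/g) || [];  // null becomes an empty list
}

[
  "abc+1*def",           // [ 'abc', '1', 'def' ]
  "{abc}+1+def",         // [ '1', 'def' ]
  "abc+(1+def)",         // [ 'abc', '1', 'def' ]
  "abc+(1+{def})",       // [ 'abc', '1' ]
  "abc def+(1.1+{ghi})", // [ 'abc def', '1.1' ]
  "1.1-{abc def}"        // [ '1.1' ]
].forEach(function (expr) {
  console.log(expr, "->", freeTerms(expr));
});
```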

Upvotes: 0
