fluminis

Reputation: 4079

Javascript regex find variables in a math equation

I want to find the elements of a math expression that are not wrapped between { and }.

Examples:

Rules

So far, I ended up with: http://regex101.com/r/gU0dO4

(^[^/*+({})-]+|(?:[/*+({})-])[^/*+({})-]+(?:[/*+({})-])|[^/*+({})-]+$)

I split the task into 3:

But it doesn't work as expected.

Any ideas?

Upvotes: 3

Views: 3221

Answers (4)

M'vy

Reputation: 5774

The variable names you mentioned can be matched by \b[\w.]+\b since they are strictly bounded by word separators.

Since you have well-formed formulas, the names you don't want to capture are always followed by }, so you can use a negative lookahead to exclude them:

(\b[\w.]+ \b)(?!})

Will match the required elements (http://regexr.com/38rch).

Edit:

For more complex uses like correctly matching :

  • abc {def{}}
  • abc def+(1.1+{g{h}i})

We need to change the lookahead term to (?!({|}))

To include the match of 1.2-{abc def} we need to change the \b [1]. This term uses lookaround expressions, which are not available in JavaScript, so we have to work around that.

(?:^|[^a-zA-Z0-9. ])([a-zA-Z0-9. ]+(?=[^0-9A-Za-z. ]))(?!({|}))

Seems to be a good one for our examples (http://regex101.com/r/oH7dO1).
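
To try the last pattern in JavaScript, one option is to collect its first capture group over the whole input. This is only a sketch: the helper name captureVariables is made up for illustration, and the braces in the lookahead are escaped, which does not change what the pattern matches.

```javascript
// Hypothetical helper: returns the first capture group of every match.
function captureVariables(expr) {
    // Same pattern as above, with { and } escaped inside the lookahead.
    var pattern = /(?:^|[^a-zA-Z0-9. ])([a-zA-Z0-9. ]+(?=[^0-9A-Za-z. ]))(?!(\{|\}))/g;
    var found = [], m;
    while ((m = pattern.exec(expr)) !== null) {
        found.push(m[1]); // group 1 is the name without the leading separator
    }
    return found;
}

var names = captureVariables("abc def+(1.1+{ghi})");
// names: ["abc def", "1.1"]
```

Note that the pattern requires one more character after each name, so a name at the very end of the string (the b in a+b) is not captured.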

[1] \b is the boundary between a \w and a \W, \z or \a. Since \w does not include the space character and \W does, it is incompatible with the definition of our variable names.

Upvotes: 1

gog

Reputation: 11347

This might be an interesting regexp challenge, but in the real world you'd be much better off simply finding all [^+/*()-]+ groups and removing those enclosed in {}:

"abc def+(1.1+{ghi})".match(/[^+/*()-]+/g).filter(
    function(x) { return !/^{.+?}$/.test(x) })
// ["abc def", "1.1"]

That being said, regexes are not the right tool for parsing math expressions. For serious parsing, consider using formal grammars and parsers. There are plenty of parser generators for JavaScript; for example, in PEG.js you can write a grammar like

expr
  = left:multiplicative "+" expr
  / multiplicative

multiplicative
  = left:primary "*" right:multiplicative
  / primary

primary
  = atom
  / "{" expr "}"
  / "(" expr ")"

atom = number / word

number = n:[0-9.]+ { return parseFloat(n.join("")) }
word = w:[a-zA-Z ]+ { return w.join("") }

and generate a parser which will be able to turn

 abc def+(1.1+{ghi})

into

[
   "abc def",
   "+",
   [
      "(",
      [
         1.1,
         "+",
         [
            "{",
            "ghi",
            "}"
         ]
      ],
      ")"
   ]
]

Then you can iterate over this array as usual and fetch the parts you're interested in.
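
As a rough illustration of that last step, the nested array can be walked recursively, skipping any sub-array that starts with "{". The function name collectAtoms is hypothetical, and the sketch assumes the parser output has exactly the shape shown above.

```javascript
// Hypothetical helper: gathers atoms that are not inside a {...} group.
function collectAtoms(node, out) {
    out = out || [];
    if (Array.isArray(node)) {
        if (node[0] === "{") return out; // braced group: ignore it entirely
        node.forEach(function (child) { collectAtoms(child, out); });
    } else if (["+", "*", "(", ")"].indexOf(node) === -1) {
        out.push(node); // anything that is not an operator or bracket token
    }
    return out;
}

// Assumed parser output, copied from the array above.
var tree = ["abc def", "+", ["(", [1.1, "+", ["{", "ghi", "}"]], ")"]];
var atoms = collectAtoms(tree);
// atoms: ["abc def", 1.1]  (1.1 is a number because the grammar calls parseFloat)
```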

Upvotes: 2

Benjamin Gruenbaum

Reputation: 276406

Matching {}s, especially nested ones, is hard (read: impossible) for a standard regular expression, since it requires counting the number of {s encountered so far to know which } terminates which.

Instead, simple string manipulation could work. This is a very basic parser that just reads the string left to right and consumes characters only while outside of braces.

var input = "abc def+(1.1+{ghi})"; // assumed well formed; operator precedence is ignored
var operators = ["+", "-", "(", ")", "/", "*"];
var inParens = false; // true while we are inside a {...} group
var output = [], buffer = "", parenCount = 0;
for (var i = 0; i < input.length; i++) {
    if (!inParens) {
        if (input[i] === "{") {
            inParens = true;
            parenCount++;
        } else if (operators.indexOf(input[i]) !== -1) { // got a symbol
            if (buffer !== "") { // buffer has stuff to add to output
                output.push(buffer); // add the element that just ended
                buffer = "";
            }
        } else { // letter, digit, dot or space
            buffer += input[i]; // push to buffer
        }
    } else { // inParens is true
        if (input[i] === "{") parenCount++;
        if (input[i] === "}") parenCount--;
        if (parenCount === 0) inParens = false; // consume again
    }
}
if (buffer !== "") output.push(buffer); // flush an element that ends the string
// output: ["abc def", "1.1"]
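
The same left-to-right scan can be packaged as a function so it is easy to try on other inputs, including nested braces. This is a sketch under the same assumptions (well-formed input, no precedence); the name findOutsideBraces is made up, and it tracks only a depth counter instead of a separate flag.

```javascript
// Hypothetical wrapper around the scan described above.
function findOutsideBraces(input) {
    var operators = "+-()/*";
    var output = [], buffer = "", depth = 0;
    for (var i = 0; i < input.length; i++) {
        var ch = input[i];
        if (ch === "{") {
            depth++;                      // entering a braced group
        } else if (ch === "}") {
            depth--;                      // leaving a braced group
        } else if (depth === 0) {         // only consume outside of braces
            if (operators.indexOf(ch) !== -1) {
                if (buffer !== "") { output.push(buffer); buffer = ""; }
            } else {
                buffer += ch;
            }
        }
    }
    if (buffer !== "") output.push(buffer); // element at the end of the string
    return output;
}

var nested = findOutsideBraces("abc def+(1.1+{g{h}i})");
// nested: ["abc def", "1.1"]
```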

Upvotes: 3

Amit Joki

Reputation: 59262

Going forward with user2864740's comment, you can replace everything between { and } with an empty string and then match what remains.

var matches = "string here".replace(/{.+?}/g,"").match(/\b[\w. ]+\b/g);

Since you know that expressions are valid, just select \w+
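
The one-liner above stops at the first } it meets, so it assumes the braces are not nested. If nesting can occur, one possible variation is to remove the innermost {...} groups repeatedly until nothing changes. This is a sketch; stripBraces and findVariables are hypothetical names.

```javascript
// Hypothetical helper: removes innermost {...} groups until none are left.
function stripBraces(expr) {
    var previous;
    do {
        previous = expr;
        expr = expr.replace(/\{[^{}]*\}/g, ""); // only groups with no inner braces
    } while (expr !== previous);
    return expr;
}

// Hypothetical helper: applies the match from the answer to the stripped string.
function findVariables(expr) {
    return stripBraces(expr).match(/\b[\w. ]+\b/g) || []; // match() returns null when nothing is found
}

var vars = findVariables("abc def+(1.1+{g{h}i})");
// vars: ["abc def", "1.1"]
```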

Upvotes: 0
