Reputation: 19
I've been struggling with this for quite a while (not being a regex ninja), searching Stack Overflow and going through trial and error. I think I'm close, but there are still a few hiccups that I need help sorting out.
The requirement is that a given equation, which includes variables, exponents, etc., is split by the regex pattern into its variables, constants, values, operators, and so on. What I have so far:
Regex re = new Regex(@"(\,|\(|\)|(-?\d*\.?\d+e[+-]?\d+)|\+|\-|\*|\^)");
var tokens = re.Split(equation);
So an equation such as
2.75423E-19* (var1-5)^(1.17)* (var2)^(1.86)* (var3)^(3.56)
should parse to
[2.75423E-19, *, (, var1, -, 5, ), ^, (, 1.17, ), *, ...., 3.56, )]
However, the exponent portion is getting split as well, which I think is due to the regex portion: |\+|\-.
Other renditions I've tried are:
Regex re1 = new Regex(@"([\,\+\-\*\(\)\^\/\ ])"); and
Regex re = new Regex(@"(-?\d*\.?\d+e[+-]?\d+)|([\,\+\-\*\(\)\^\/\ ])");
which both have their flaws. Any help would be appreciated.
Upvotes: 2
Views: 1438
Reputation: 626903
For equations like the one posted in the question, you can use
[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?|[-^+*/()]|\w+
See regex demo
The regex matches:
- [0-9]*\.?[0-9]+([eE][-+]?[0-9]+)? - a float number
- | - or...
- [-^+*/()] - any of the arithmetic operators and parentheses present in the equation posted
- | - or...
- \w+ - 1 or more word characters (letters, digits or underscore).
For more complex tokenization, consider using NCalc, as suggested in Lucas Trzesniewski's comment.
var line = "2.75423E-19* (var1-5)^(1.17)* (var2)^(1.86)* (var3)^(3.56)";
var matches = Regex.Matches(line, @"[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?|[-^+*/()]|\w+");
foreach (Match m in matches)
    Console.WriteLine(m.Value);
And here is updated code to show that Regex.Split is not necessary here:
var result = Regex.Matches(line, @"\d+(?:[,.]\d+)*(?:e[-+]?\d+)?|[-^+*/()]|\w+", RegexOptions.IgnoreCase)
    .Cast<Match>()
    .Select(p => p.Value)
    .ToList();
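To make the Regex.Matches approach easier to try out, here is a minimal sketch that wraps the pattern above in a helper and runs it on part of the equation from the question. The class name TokenizerSketch and the method name Tokenize are made up for illustration. Note that the number alternative in the question's pattern uses a lowercase e without RegexOptions.IgnoreCase, so it cannot consume the uppercase "E-19"; this is most likely why the exponent was being split at the minus sign.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class TokenizerSketch
{
    // Pattern taken from the answer: number (optional exponent), operator/parenthesis, or word.
    // IgnoreCase lets the lowercase "e" in the pattern also match "E".
    private static readonly Regex TokenPattern = new Regex(
        @"\d+(?:[,.]\d+)*(?:e[-+]?\d+)?|[-^+*/()]|\w+",
        RegexOptions.IgnoreCase);

    // Hypothetical helper: returns every match, in order, as a string.
    public static List<string> Tokenize(string equation)
    {
        return TokenPattern.Matches(equation)
            .Cast<Match>()
            .Select(m => m.Value)
            .ToList();
    }

    public static void Main()
    {
        var tokens = Tokenize("2.75423E-19* (var1-5)^(1.17)");
        // Prints: 2.75423E-19 * ( var1 - 5 ) ^ ( 1.17 )
        Console.WriteLine(string.Join(" ", tokens));
    }
}
```

Whitespace never matches any alternative, so it is skipped without any trimming. The same applies to any character the pattern does not know about, which is silently dropped rather than reported.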
Also, to match formatted numbers, you can use \d+(?:[,.]\d+)* rather than [0-9]*\.?[0-9]+ or \d+(,\d+)*.
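A small comparison may help when choosing between these number patterns. The sketch below runs two of them over the same inputs; the class name NumberPatternSketch and the helper FindAll are hypothetical. One trade-off to be aware of: because the formatted-number pattern accepts commas between digit groups, it will also merge comma-separated arguments such as 1,2 into a single match, which matters if the equations can contain function calls.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class NumberPatternSketch
{
    // Hypothetical helper: all matches of a pattern in the input, in order.
    public static string[] FindAll(string input, string pattern)
    {
        return Regex.Matches(input, pattern)
            .Cast<Match>()
            .Select(m => m.Value)
            .ToArray();
    }

    public static void Main()
    {
        const string grouped = @"\d+(?:[,.]\d+)*"; // digit groups joined by , or .
        const string plain = @"[0-9]*\.?[0-9]+";   // plain integer or decimal

        // Prints: 1,234.56
        Console.WriteLine(string.Join(" | ", FindAll("1,234.56", grouped)));
        // Prints: 1 | 234.56   (the comma is not part of any match)
        Console.WriteLine(string.Join(" | ", FindAll("1,234.56", plain)));
        // Prints: 1,2   (two arguments merged into one match)
        Console.WriteLine(string.Join(" | ", FindAll("max(1,2)", grouped)));
    }
}
```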
Upvotes: 4
Reputation: 19
I think I've got a solution. @stribizhev's answer led me to the following regex:
Regex re = new Regex(@"(\d+(,\d+)*(?:\.\d+)?(?:[eE][-+]?[0-9]+)?|[-^+*/()]|\w+)");
tokenList = re.Split(InfixExpression).Select(t => t.Trim()).Where(t => t != "").ToList();
Splitting with this pattern gives me the desired array.
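For anyone wondering why Split works here at all: when the pattern contains a capturing group, Regex.Split includes the captured text in the returned array, so every token comes back alongside the (mostly empty or whitespace) pieces between tokens. The sketch below illustrates this with a variant of the pattern that has a single outer capturing group, non-capturing inner groups, and an escaped decimal point; those details, and the names SplitSketch and SplitTokens, are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class SplitSketch
{
    // Single outer capturing group, so Split adds each matched token exactly once.
    private const string Pattern =
        @"(\d+(?:,\d+)*(?:\.\d+)?(?:[eE][-+]?\d+)?|[-^+*/()]|\w+)";

    // Hypothetical helper: split, trim each piece, and drop the empty ones.
    public static List<string> SplitTokens(string equation)
    {
        return Regex.Split(equation, Pattern)
            .Select(t => t.Trim())
            .Where(t => t != "")
            .ToList();
    }

    public static void Main()
    {
        // Prints: ( var1 - 5 ) ^ ( 1.17 )
        Console.WriteLine(string.Join(" ", SplitTokens("(var1-5)^(1.17)")));
        // Prints: a % b   (the unmatched "%" survives as its own piece)
        Console.WriteLine(string.Join(" ", SplitTokens("a % b")));
    }
}
```

One practical difference from Regex.Matches is visible in the second call: text that no alternative matches is kept as a separate element instead of being dropped, which can make unexpected characters easier to spot.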
Upvotes: -1