Xai Nano

Reputation: 99

Java regex ability to handle nested matches separately

I can't figure out how to write a regex pattern that converts nested fractions in user input into LaTeX.

Here are a couple of sample user inputs:

"fractionx+3over5moveout+2"
"fractionfractionx+1over7moveoutover3moveout+1"
"fractionfractionfractionx+3over3moveoutoverx+2moveoutover7moveout+1"

Here's my regex code (I made partial versions):

// Final version, with the closing "moveout" keyword
regexPattern = Pattern.compile("fraction(?<upper>.*?)over(?<lower>.*?)moveout");
regexMatcher = regexPattern.matcher(userInput);
// Four backslashes in the Java source produce one literal backslash in the output
mathFormulaInLaTeX = regexMatcher.replaceAll("\\\\frac{${upper}} {${lower}}");

// Starting version, without the "over" keyword
regexPattern = Pattern.compile("fraction(?<upper>.*)");
regexMatcher = regexPattern.matcher(userInput);
mathFormulaInLaTeX = regexMatcher.replaceAll("\\\\frac{${upper}} {}");

With the following inputs I get these results:

Input: "fractionx+3over5moveout+2"

(The final regex version works fine with only one fraction) [image]

Input: "fractionfractionx+1"

(The starting version, without the "over" keyword, does not handle nested fractions properly) [image]

Input: "fractionfractionx+1over7moveoutover3moveout+1"

(With nested fractions, the final regex version leaves the word "moveout" in the output) [image]

Input: "fractionfractionfractionx+3over3moveoutoverx+2moveoutover7moveout+1"

(Completely misclassified) [image]

Is there a neat pattern that works for any number of nested fractions, keeps the word "moveout" out of the output, and produces a properly nested fraction like this:

[image]

I appreciate any help.

Upvotes: 0

Views: 80

Answers (1)

Sebastian Lenartowicz

Reputation: 4864

As has been said in the comments, this is not a good idea. Mathematical expressions with nesting, even simple ones, do not form a regular language; describing them takes a context-free grammar. While it is hypothetically possible to match an arbitrary math expression with the extended features of modern regex engines, attempting to parse one that way is foolhardy at best. I would recommend either rolling your own lexer and parser or using a parser generator such as ANTLR.
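To give an idea of what "rolling your own" could look like, here is a minimal recursive-descent sketch. It is not a full math parser: it assumes the only keywords are "fraction", "over" and "moveout", that every fraction is closed by its own "moveout", and that ordinary text never contains those keywords (a variable named "cover" would break it). The class name FractionParser and the method toLatex are made up for this illustration.

```java
// Hypothetical helper: converts "fraction ... over ... moveout" input into LaTeX.
// Assumed grammar:
//   expression := (plainText | fraction)*
//   fraction   := "fraction" expression "over" expression "moveout"
final class FractionParser {
    private final String input;
    private int pos = 0;

    private FractionParser(String input) {
        this.input = input;
    }

    static String toLatex(String input) {
        FractionParser parser = new FractionParser(input);
        String result = parser.parseExpression();
        // Anything left over is a stray "over" or "moveout" outside a fraction.
        if (parser.pos != input.length()) {
            throw new IllegalArgumentException("Unexpected keyword at index " + parser.pos);
        }
        return result;
    }

    // Copies plain text and recurses into fractions until it reaches
    // "over", "moveout", or the end of the input.
    private String parseExpression() {
        StringBuilder out = new StringBuilder();
        while (pos < input.length()) {
            if (input.startsWith("over", pos) || input.startsWith("moveout", pos)) {
                break; // the enclosing fraction consumes this keyword
            } else if (input.startsWith("fraction", pos)) {
                out.append(parseFraction());
            } else {
                out.append(input.charAt(pos++));
            }
        }
        return out.toString();
    }

    // Parses one complete fraction; nested fractions are handled by recursion.
    private String parseFraction() {
        expect("fraction");
        String upper = parseExpression();
        expect("over");
        String lower = parseExpression();
        expect("moveout");
        return "\\frac{" + upper + "}{" + lower + "}";
    }

    private void expect(String keyword) {
        if (!input.startsWith(keyword, pos)) {
            throw new IllegalArgumentException(
                "Expected \"" + keyword + "\" at index " + pos);
        }
        pos += keyword.length();
    }
}
```

With this sketch, FractionParser.toLatex("fractionfractionx+1over7moveoutover3moveout+1") returns \frac{\frac{x+1}{7}}{3}+1. Because each call to parseFraction consumes its own "over" and "moveout", the nesting depth does not matter, which is exactly what the lazy quantifiers in the regex cannot guarantee.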

Upvotes: 2
