Reputation: 53

Recursive replace with Java regular expression?

I can replace ABC(10,5) with (10)%(5) using:

replaceAll("ABC\\(([^,]*)\\,([^,]*)\\)", "($1)%($2)")

but I'm unable to figure out how to do it for ABC(ABC(20,2),5) or ABC(ABC(30,2),3+2).

If I'm able to convert to ((20)%(2))%5 how can I convert back to ABC(ABC(20,2),5)?

Thanks, j

Upvotes: 5

Answers (4)

flori10

Reputation: 522

You could use this Regular Expressions library https://github.com/florianingerl/com.florianingerl.util.regex , that also supports Recursive Regular Expressions.

Converting ABC(ABC(20,2),5) to ((20)%(2))%(5) looks like this:

    Pattern pattern = Pattern.compile("(?<abc>ABC\\((?<arg1>(?:(?'abc')|[^,])+)\\,(?<arg2>(?:(?'abc')|[^)])+)\\))");
    Matcher matcher = pattern.matcher("ABC(ABC(20,2),5)");
    String replacement = matcher.replaceAll(new DefaultCaptureReplacer() {
        @Override
        public String replace(CaptureTreeNode node) {
            if ("abc".equals(node.getGroupName())) {
                return "(" + replace(node.getChildren().get(0)) + ")%(" + replace(node.getChildren().get(1)) + ")";
            } else
                return super.replace(node);
        }

    });
    System.out.println(replacement);
    assertEquals("((20)%(2))%(5)", replacement);

Converting back again, i.e. from ((20)%(2))%(5) to ABC(ABC(20,2),5) looks like this:

    Pattern pattern = Pattern.compile("(?<fraction>(?<arg>\\(((?:(?'fraction')|[^)])+)\\))%(?'arg'))");
    Matcher matcher = pattern.matcher("((20)%(2))%(5)");
    String replacement = matcher.replaceAll(new DefaultCaptureReplacer() {
        @Override
        public String replace(CaptureTreeNode node) {
            if ("fraction".equals(node.getGroupName())) {
                return "ABC(" + replace(node.getChildren().get(0)) + "," + replace(node.getChildren().get(1)) + ")";
            } else if ("arg".equals(node.getGroupName())) {
                return replace(node.getChildren().get(0));
            } else
                return super.replace(node);
        }

    });
    System.out.println(replacement);
    assertEquals("ABC(ABC(20,2),5)", replacement);

Upvotes: 1

Boris Strandjev

Reputation: 46963

I am going to answer about the first question. I was not able to do the task in a single replaceAll. I don't think it is even achievable. However if I use loop then this should do the work for you:

    String termString = "([0-9+\\-*/()%]*)";
    String pattern = "ABC\\(" + termString + "\\," + termString + "\\)";
    String [] strings = {"ABC(10,5)", "ABC(ABC(20,2),5)", "ABC(ABC(30,2),3+2)"};
    for (String str : strings) {
        while (true) {
            String replaced = str.replaceAll(pattern, "($1)%($2)");
            if (replaced.equals(str)) {
                break;
            }
            str = replaced;
        }
        System.out.println(str);
    }

I am assuming you are writing parser for numeric expressions, thus the definition of term termString = "([0-9+\\-*/()%]*)". It outputs this:

(10)%(5)
((20)%(2))%(5)
((30)%(2))%(3+2)

EDIT As per the OP request I add the code for decoding the strings. It is a bit more hacky than the forward scenario:

    String [] encoded = {"(10)%(5)", "((20)%(2))%(5)", "((30)%(2))%(3+2)"};
    String decodeTerm = "([0-9+\\-*ABC\\[\\],]*)";
    String decodePattern = "\\(" + decodeTerm + "\\)%\\(" + decodeTerm + "\\)";
    for (String str : encoded) {
        while (true) {
            String replaced = str.replaceAll(decodePattern, "ABC[$1,$2]");
            if (replaced.equals(str)) {
                break;
            }
            str = replaced;
        }
        str = str.replaceAll("\\[", "(");
        str = str.replaceAll("\\]", ")");
        System.out.println(str);
    }

And the output is:

ABC(10,5)
ABC(ABC(20,2),5)
ABC(ABC(30,2),3+2)

Upvotes: 1

Joop Eggen

Reputation: 109613

You can start evaluating the inner most reducable expressions first, till no more redux exists. However you have to take care of other ,, ( and ). The solution of @BorisStrandjev is better, more bullet proof.

String infix(String expr) {
    // Use place holders for '(' and ')' to use regex [^,()].
    expr = expr.replaceAll("(?!ABC)\\(", "<<");
    expr = expr.replaceAll("(?!ABC)\\)", ">>");
    for (;;) {
        String expr2 = expr.replaceAll("ABC\\(([^,()]*)\\,([^,()]*)\\)",
                "<<$1>>%<<$2>>");
        if (expr2 == expr)
            break;
        expr = expr2;
    }
    expr = expr.replaceAll("<<", ")");
    expr = expr.replaceAll(">>", ")");
    return expr;
}

Upvotes: 1

lucian.pantelimon

Reputation: 3669

You can try to rewrite the string using the Polish notation and then replace any % X Y with ABC(X,Y).

Here's the wiki link for the Polish notation.

The problem is that you need to find out which rewrite of ABC(X,Y) occurred first when you recursively replaced them in your string. The Polish notation is useful for "deciphering" the order that these rewrites occur and is widely used in expression evaluation.

You can do this by using a stack and recording which replace occurred first: find the inner-most set of parentheses, push only that expression onto the stack, then remove that from your string. When you want to reconstruct the expression original expression, just start at the top of the stack and apply the reverse transformation (X)%(Y) -> ABC(X,Y).

This is somewhat a form of the Polish notation, with the only difference being that you don't store the entire expression as a string, but rather store it in a stack for easier processing.

In short, when replacing, start with the inner-most terms (the ones that have no parentheses in them) and apply the reverse replace.

It may be helpful to use (X)%(Y) -> ABC{X,Y} as an intermediary rewrite rule, then rewrite the curly brackets as round brackets. This way it will be easier to determine which is the inner-most term, as the new terms won't use round brackets. Also it is easier to implement, but not as elegant.

Upvotes: 0

Recursive replace with Java regular expression?

Answers (4)

Related Questions