Goose

Reputation: 371

Java regexp for nested parentheses

Consider the following Java String:

String input = "a, b, (c, d), e, f, (g, (h, i))";

Can you help me find a Java regexp to obtain its 6 parts:

a
b
(c, d)
e
f
(g, (h, i))

These parts are obtained by splitting the original input string on its "most external" commas, that is, the commas that are not nested inside parentheses.

Upvotes: 0

Views: 148

Answers (1)

Pshemo

Reputation: 124285

Don't try to use regex for this kind of task in Java: its regex engine doesn't support recursion, so you could end up with a monster regex like the one shown in "Is it possible to match nested brackets with a regex without using recursion or balancing groups?".
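To see why a simple pattern is not enough, here is a minimal sketch of the naive approach: splitting on every comma with `String.split`. The class and method names (`NaiveSplitDemo`, `naiveSplit`) are made up for this illustration. Because the pattern knows nothing about parentheses, it also cuts inside the nested groups.

```java
import java.util.Arrays;
import java.util.List;

public class NaiveSplitDemo {
    // Hypothetical helper: splits on every comma (plus surrounding whitespace),
    // ignoring parentheses entirely.
    static List<String> naiveSplit(String data) {
        return Arrays.asList(data.split("\\s*,\\s*"));
    }

    public static void main(String[] args) {
        // Prints 9 fragments instead of the 6 wanted parts,
        // e.g. "(c" and "d)" end up as separate elements.
        for (String s : naiveSplit("a, b, (c, d), e, f, (g, (h, i))")) {
            System.out.println(s);
        }
    }
}
```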

The simplest solution is to write your own parser that tracks the balance of ( and ) (let's call it the nesting level) and splits on , only when the nesting level is 0.

Code for this task (which solves the problem in a single pass over the input) could look like this:

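import java.util.ArrayList;
import java.util.List;

public static List<String> splitOnNotNestedCommas(String data) {
    List<String> resultList = new ArrayList<>();

    StringBuilder sb = new StringBuilder(); // collects the current part
    int nestingLvl = 0;                     // number of currently open '('
    for (char ch : data.toCharArray()) {
        if (ch == '(') nestingLvl++;
        if (ch == ')') nestingLvl--;
        if (ch == ',' && nestingLvl == 0) {
            // top-level comma: finish the current part and start a new one
            resultList.add(sb.toString().trim());
            sb.setLength(0);
        } else {
            sb.append(ch);
        }
    }
    // add the last part, which is not followed by a comma
    if (sb.length() > 0)
        resultList.add(sb.toString().trim());

    return resultList;
}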

Usage:

for (String s : splitOnNotNestedCommas("a, b, (c, d), e, f, (g, (h, i))")){
    System.out.println(s);
}

Output:

a
b
(c, d)
e
f
(g, (h, i))
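Note that the method above assumes balanced parentheses. With a stray ")" the nesting level drops below 0 and no later comma is treated as top-level. If that matters, a stricter variant could look like the sketch below. The names `StrictSplitter` and `splitStrict`, and the choice of `IllegalArgumentException`, are assumptions made for this example and not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;

public class StrictSplitter {
    // Hypothetical variant: same nesting-level counting,
    // but unbalanced parentheses are rejected instead of silently accepted.
    public static List<String> splitStrict(String data) {
        List<String> result = new ArrayList<>();
        StringBuilder sb = new StringBuilder();
        int nestingLvl = 0;
        for (char ch : data.toCharArray()) {
            if (ch == '(') nestingLvl++;
            if (ch == ')') nestingLvl--;
            if (nestingLvl < 0) {
                // a ')' appeared without a matching '('
                throw new IllegalArgumentException("Unmatched ')' in: " + data);
            }
            if (ch == ',' && nestingLvl == 0) {
                result.add(sb.toString().trim());
                sb.setLength(0);
            } else {
                sb.append(ch);
            }
        }
        if (nestingLvl != 0) {
            // at least one '(' was never closed
            throw new IllegalArgumentException("Unmatched '(' in: " + data);
        }
        if (sb.length() > 0) {
            result.add(sb.toString().trim());
        }
        return result;
    }
}
```

For balanced input such as the string from the question it returns the same six parts; for input like "a, (b, c" it throws instead of returning a partial result.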

Upvotes: 3
