Java splitting nested brackets string

Question

I have a string like:

Fields  { name:"aa" type: "bb" paramA { name:"cc" } paramB { other:"ee" other_p:"ff"} paramC { name: "bb" param: "dd" other_params { abc: "xx" xyz:"yy"}} }

My Java regex code extracts everything between the brackets for paramA, paramB and other_params. I need to structure this into a Java object, but I am stuck on extracting paramC.

Pattern pattern = Pattern.compile("\\w+\\s(\\{([^{]*?)\\})");
Matcher matcher = pattern.matcher(theAboveString);
while (matcher.find()) {
    System.out.println(matcher.group(1));
}

My code for the extraction

Andreas · Accepted Answer

Here's an example of parsing using regex:

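import java.util.ArrayDeque;
import java.util.Deque;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String input = "Fields  { name:\"aa\" type: \"bb\" paramA { name:\"cc\" } paramB { other:\"ee\" other_p:\"ff\"} paramC { name: \"bb\" param: \"dd\" other_params { abc: \"xx\" xyz:\"yy\"}} }";
// Each match is one token: key: "value" (groups 1 and 2), key { (group 1 only), or } (no groups)
Matcher m = Pattern.compile("\\s*(?:(\\w+)\\s*(?::\\s*(\".*?\")|\\{)|\\})\\s*").matcher(input);
int start = 0; // end of the previous match; the next match must begin exactly here
Deque<String> stack = new ArrayDeque<>(); // names of the currently open blocks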
while (m.find()) {
    if (m.start() != start)
        throw new IllegalArgumentException("Invalid data at " + start);
    if (m.group(2) != null) {
        System.out.println(stack + " : " + m.group(1) + " = " + m.group(2));
    } else if (m.group(1) != null) {
        //System.out.println(m.group(1) + " {");
        stack.addLast(m.group(1));
    } else {
        //System.out.println("}");
        if (stack.isEmpty())
            throw new IllegalArgumentException("Unbalanced brace at " + start);
        stack.removeLast();
    }
    start = m.end();
}
if (start != input.length())
    throw new IllegalArgumentException("Invalid data at " + start);
if (! stack.isEmpty())
    throw new IllegalArgumentException("Unexpected end of text");

Output

[Fields] : name = "aa"
[Fields] : type = "bb"
[Fields, paramA] : name = "cc"
[Fields, paramB] : other = "ee"
[Fields, paramB] : other_p = "ff"
[Fields, paramC] : name = "bb"
[Fields, paramC] : param = "dd"
[Fields, paramC, other_params] : abc = "xx"
[Fields, paramC, other_params] : xyz = "yy"

You should be able to take it from here.
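As one possible way to take it from here, the same token loop can build a tree of nested maps instead of printing. This is only a sketch: the class name NestedBlockParser and the choice of LinkedHashMap are assumptions of mine, not part of the answer above, and it uses the string-only pattern from the first snippet. Note that a Map keeps one entry per key, so a block name that repeats inside the same parent would overwrite the earlier one; a list-based structure would be needed for that case.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: builds nested maps from the block syntax.
class NestedBlockParser {
    // Same token pattern as in the answer: key: "value", key {, or }
    private static final Pattern TOKEN =
            Pattern.compile("\\s*(?:(\\w+)\\s*(?::\\s*(\".*?\")|\\{)|\\})\\s*");

    /** Returns a tree in which a block name maps to a child map and a key maps to its string value. */
    static Map<String, Object> parse(String input) {
        Map<String, Object> root = new LinkedHashMap<>();
        Deque<Map<String, Object>> stack = new ArrayDeque<>();
        stack.addLast(root); // the root map is always at the bottom of the stack
        Matcher m = TOKEN.matcher(input);
        int start = 0;
        while (m.find()) {
            if (m.start() != start)
                throw new IllegalArgumentException("Invalid data at " + start);
            if (m.group(2) != null) {
                // key: "value" -> store the value without its surrounding quotes
                String quoted = m.group(2);
                stack.peekLast().put(m.group(1), quoted.substring(1, quoted.length() - 1));
            } else if (m.group(1) != null) {
                // key { -> open a child map and descend into it
                Map<String, Object> child = new LinkedHashMap<>();
                stack.peekLast().put(m.group(1), child);
                stack.addLast(child);
            } else {
                // } -> close the current block; the root itself can never be closed
                if (stack.size() == 1)
                    throw new IllegalArgumentException("Unbalanced brace at " + start);
                stack.removeLast();
            }
            start = m.end();
        }
        if (start != input.length())
            throw new IllegalArgumentException("Invalid data at " + start);
        if (stack.size() != 1)
            throw new IllegalArgumentException("Unexpected end of text");
        return root;
    }

    public static void main(String[] args) {
        String input = "Fields  { name:\"aa\" type: \"bb\" paramA { name:\"cc\" } }";
        // prints {Fields={name=aa, type=bb, paramA={name=cc}}}
        System.out.println(parse(input));
    }
}
```

Nested values can then be read by walking the maps, for example by casting root.get("Fields") to a Map and calling get("paramA") on it.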

UPDATE

To also support numeric values, use this regex:

"\\s*(?:(\\w+)\\s*(?::\\s*(\".*?\"|[-+0-9.eE]+)|\\{)|\\})\\s*"

Testing with this input:

Layer { name: "conv2" type: "Convolution" bottom: "norm1" top: "conv2" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 256 pad: 2 kernel_size: 5 group: 2 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 1 } }}

produces:

[Layer] : name = "conv2"
[Layer] : type = "Convolution"
[Layer] : bottom = "norm1"
[Layer] : top = "conv2"
[Layer, param] : lr_mult = 1
[Layer, param] : decay_mult = 1
[Layer, param] : lr_mult = 2
[Layer, param] : decay_mult = 0
[Layer, convolution_param] : num_output = 256
[Layer, convolution_param] : pad = 2
[Layer, convolution_param] : kernel_size = 5
[Layer, convolution_param] : group = 2
[Layer, convolution_param, weight_filler] : type = "gaussian"
[Layer, convolution_param, weight_filler] : std = 0.01
[Layer, convolution_param, bias_filler] : type = "constant"
[Layer, convolution_param, bias_filler] : value = 1
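With the numeric pattern, group 2 still holds raw text, either a quoted string or a run of numeric characters. The sketch below shows one way the captured text might be turned into a typed value. ValueConverter and convert are hypothetical names, and the mapping to String, Long and Double is an assumption rather than anything the format prescribes. Because [-+0-9.eE]+ also accepts malformed tokens such as "--" or "1e", the conversion can still fail with a NumberFormatException.

```java
// Hypothetical helper: converts the text captured by group 2 into a Java value.
class ValueConverter {
    /** Returns a String for quoted text, a Long for integers, and a Double otherwise. */
    static Object convert(String raw) {
        if (raw.startsWith("\"")) {
            // Quoted value: drop the surrounding quotes
            return raw.substring(1, raw.length() - 1);
        }
        try {
            return Long.valueOf(raw);
        } catch (NumberFormatException notAnInteger) {
            // Not an integer; Double.valueOf throws NumberFormatException
            // if the token is not a valid number at all
            return Double.valueOf(raw);
        }
    }

    public static void main(String[] args) {
        System.out.println(convert("\"gaussian\"")); // String
        System.out.println(convert("256"));          // Long
        System.out.println(convert("0.01"));         // Double
    }
}
```

In the parsing loop, this would be called as convert(m.group(2)) in the branch where group 2 is not null.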
