juntao

Reputation: 13

Extract key/val pair, value can span across lines

Input file:

key1=1
key2=start(a
b
c=
d)end
key3=d=e=f
somekey=start(123)end
morekey=start(1
2)end
key=jj

Expected output:

key1    -> 1
key2    -> a
           b
           c=
           d
key3    -> d=e=f
somekey -> 123
morekey -> 1
           2
key     -> jj

Request: I'm trying to do this in Java. I can't use java.util.Properties. Regex is acceptable but not preferred; I would rather use StringUtils.substringBetween. How can I read a value that spans multiple lines and preserve its newlines? The following obviously doesn't work for multi-line values. I was going to try regex, but only if there is no more elegant way.

    String[] str = line.split("=", 2);
    StringUtils.substringBetween(line, startString, endString);

Upvotes: 1

Views: 128

Answers (3)

janos

Reputation: 124666

One way to solve this is to write your own parser. For example:

public static final String START = "start(";
public static final String END = ")end";

// ...

Scanner scanner = new Scanner(
        "key1=1\n" +
        "key2=start(a\n" +
        "b\n" +
        "c=\n" +
        "d)end\n" +
        "key3=d=e=f\n" +
        "somekey=start(123)end\n" +
        "morekey=start(1\n" +
        "2)end\n" +
        "key=jj");

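// LinkedHashMap keeps the keys in input order (a plain HashMap does not).
Map<String, String> map = new LinkedHashMap<>();
while (scanner.hasNextLine()) {
    String line = scanner.nextLine();
    // Split on the first '=' only, so "key3=d=e=f" keeps "d=e=f" as its value.
    int eq = line.indexOf('=');
    String key = line.substring(0, eq);
    String value = line.substring(eq + 1);
    if (value.startsWith(START)) {
        // Multi-line value: keep appending lines until one ends with END.
        StringBuilder sb = new StringBuilder(value.substring(START.length()));
        while (!value.endsWith(END)) {
            value = scanner.nextLine();
            sb.append('\n').append(value);
        }
        // Drop the trailing END marker.
        value = sb.substring(0, sb.length() - END.length());
    }
    map.put(key, value);
}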

for (Map.Entry<String, String> entry : map.entrySet()) {
    System.out.printf("%s -> %s\n", entry.getKey(), entry.getValue());
}
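
For anyone who wants to paste and run this, here is a self-contained sketch of the same line-by-line approach. The class name `KeyValueParser` and the methods `parse` and `format` are made up for illustration. It also makes a few assumptions that the question does not state: lines without `=` are skipped, an unterminated `start(` block simply runs to the end of the input, and the output is padded to the longest key so that continuation lines align as in the question.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Scanner;

final class KeyValueParser {
    static final String START = "start(";
    static final String END = ")end";

    /** Parses key=value lines; a value wrapped in start(...)end may span lines. */
    static Map<String, String> parse(String input) {
        // LinkedHashMap keeps keys in the order they appear in the input.
        Map<String, String> result = new LinkedHashMap<>();
        try (Scanner scanner = new Scanner(input)) {
            while (scanner.hasNextLine()) {
                String line = scanner.nextLine();
                int eq = line.indexOf('=');
                if (eq < 0) {
                    continue; // assumption: lines without '=' are ignored
                }
                String key = line.substring(0, eq);
                String value = line.substring(eq + 1);
                if (value.startsWith(START)) {
                    StringBuilder sb = new StringBuilder(value.substring(START.length()));
                    // Read on until a line closes the block or the input runs out.
                    while (!value.endsWith(END) && scanner.hasNextLine()) {
                        value = scanner.nextLine();
                        sb.append('\n').append(value);
                    }
                    int end = sb.length();
                    if (value.endsWith(END)) {
                        end -= END.length(); // strip the closing marker
                    }
                    value = sb.substring(0, end);
                }
                result.put(key, value);
            }
        }
        return result;
    }

    /** Renders the map in the question's layout, indenting continuation lines. */
    static String format(Map<String, String> map) {
        int width = 1;
        for (String key : map.keySet()) {
            width = Math.max(width, key.length());
        }
        // Continuation lines are indented by the key column plus " -> ".
        StringBuilder indent = new StringBuilder();
        for (int i = 0; i < width + 4; i++) {
            indent.append(' ');
        }
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> e : map.entrySet()) {
            out.append(String.format("%-" + width + "s -> ", e.getKey()))
               .append(e.getValue().replace("\n", "\n" + indent))
               .append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String input = "key1=1\n"
                + "key2=start(a\n"
                + "b\n"
                + "c=\n"
                + "d)end\n"
                + "key3=d=e=f\n"
                + "somekey=start(123)end\n"
                + "morekey=start(1\n"
                + "2)end\n"
                + "key=jj";
        System.out.print(format(parse(input)));
    }
}
```

Keeping parsing and formatting separate means the map holds the raw values with their original newlines, and the indentation is applied only when printing.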

Upvotes: 0

Andreas

Reputation: 159114

The following regex can find all your key/value pairs:

(?ms)^(\w+)=(?:start\((.*?)\)end|(.*?))$

The key will be in capture group 1, and the value will be in capture group 2 or 3.

Test

String input = "key1=1\r\n" +
               "key2=start(a\r\n" +
               "b\r\n" +
               "c=\r\n" +
               "d)end\r\n" +
               "key3=d=e=f\r\n" +
               "somekey=start(123)end\r\n" +
               "morekey=start(1\r\n" +
               "2)end\r\n" +
               "key=jj\r\n";

String regex = "(?ms)^(\\w+)=(?:start\\((.*?)\\)end|(.*?))$";

Map<String, String> map = new LinkedHashMap<>(); // keeps input order, matching the output below
for (Matcher m = Pattern.compile(regex).matcher(input); m.find(); )
    map.put(m.group(1), (m.start(2) != -1 ? m.group(2) : m.group(3)));

for (Entry<String, String> e : map.entrySet())
    System.out.printf("%-7s -> %s%n", e.getKey(),
                      e.getValue().replaceAll("(\\R)", "$1           "));

Output

key1    -> 1
key2    -> a
           b
           c=
           d
key3    -> d=e=f
somekey -> 123
morekey -> 1
           2
key     -> jj

Upvotes: 0

Youcef LAIDANI

Reputation: 59986

Did you mean something like this:

String str = "key1=1\n"
        + "key2=start(a\n"
        + "b\n"
        + "c=\n"
        + "d)end\n"
        + "key3=d=e=f\n"
        + "somekey=start(123)end\n"
        + "morekey=start(1\n"
        + "2)end\n"
        + "key=jj";
System.out.println(str.replaceAll("start\\(|\\)end", "")
        .replaceAll("(\\w{2})=", "$1\t-> ")
        .replaceAll("(\n\\w)", "\t$1"));

Upvotes: 1
