Drizzy92

Reputation: 21

String regex Parsing with Semicolons in data

How can I put together a regex to split a FIQL string (example below) that separates conditions with a semicolon? The problem is that semicolons can also appear inside the values.

I am using String.split but can't find the right regex. I've tried the following, in which I'm trying to match the last semicolon before the ==:

query.split("(;)[^;]*==")

But it only works for the first key-value pair.

Example string:

Key1==value1; key2==val;ue2;key3==value3

The target is an array or list: Key1==value1, key2==val;ue2, key3==value3. The problem is that the semicolon inside value 2 also causes a split.

Any idea?

Upvotes: 2

Views: 1084

Answers (3)

bashnesnos

Reputation: 816

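Use a capturing group instead, and find the tokens with java.util.regex.Matcher in a loop:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Group 1 captures one key==value condition; the trailing non-capturing
// group consumes the separator (";" plus optional whitespace) or end of input.
Pattern patrn = Pattern.compile("(?>(\\w+==[\\w;]+)(?:;\\s*|$))");
Matcher mtchr = patrn.matcher("Key1==value1; key2==val;ue2;key3==value3");

while (mtchr.find()) {
    System.out.println(mtchr.group(1));
}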

Yields:
Key1==value1
key2==val;ue2
key3==value3

Unfortunately, adding ;? won't work, since the middle tokens would no longer terminate.

Upvotes: 1

Osama Dwairi

Reputation: 13

RegExp are evil.

If you can request a minimal change to the string being parsed, so that each value is surrounded by double quotes, the string becomes Key1=="value1"; key2=="val;ue2";key3=="value3". In that case, the post "Java: splitting a comma-separated string but ignoring commas in quotes" will help you.
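As a rough sketch of the quoted-value idea, assuming every value is wrapped in double quotes and contains no escaped quotes: a semicolon is treated as a delimiter only when an even number of quotes follows it. The class and method names here are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

final class QuotedFiqlSplit {
    // Match ';' (with surrounding whitespace) only when the rest of the input
    // contains an even number of double quotes, i.e. the ';' is outside quotes.
    private static final String DELIMITER =
            "\\s*;\\s*(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)";

    // Hypothetical helper: returns the conditions with their quotes intact.
    static List<String> split(String query) {
        return Arrays.asList(query.split(DELIMITER));
    }

    public static void main(String[] args) {
        String query = "Key1==\"value1\"; key2==\"val;ue2\";key3==\"value3\"";
        for (String condition : split(query)) {
            System.out.println(condition);
        }
    }
}
```

The quotes stay part of each condition, so they still need to be stripped when the value is read.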

Alternatively, you need to write a custom String parser. Here is a quick, non-optimized CustomStringParser.
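A hand-written parser along these lines could look roughly like the sketch below. It is only my assumption of one possible approach, with hypothetical names, and not necessarily the same as the CustomStringParser mentioned above. It treats a semicolon as a delimiter only if the text between it and the next semicolon (or the end of the input) contains ==.

```java
import java.util.ArrayList;
import java.util.List;

final class FiqlConditionParser {
    // Hypothetical helper: splits on ';' only when the chunk that follows
    // (up to the next ';' or end of input) contains "==".
    static List<String> parse(String query) {
        List<String> conditions = new ArrayList<>();
        int start = 0;
        int semi = query.indexOf(';');
        while (semi >= 0) {
            int nextSemi = query.indexOf(';', semi + 1);
            int end = nextSemi >= 0 ? nextSemi : query.length();
            if (query.substring(semi + 1, end).contains("==")) {
                // This ';' really ends a condition.
                conditions.add(query.substring(start, semi).trim());
                start = semi + 1;
            }
            semi = nextSemi;
        }
        // Whatever remains is the last condition.
        conditions.add(query.substring(start).trim());
        return conditions;
    }

    public static void main(String[] args) {
        for (String c : parse("Key1==value1; key2==val;ue2;key3==value3")) {
            System.out.println(c);
        }
    }
}
```

A value that itself contains something like ;x==y would still be split incorrectly, so this only works while values never contain ==.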

Hope this helps.

Upvotes: 0

Pshemo

Reputation: 124275

It looks like you want to split on ; only if it has == after it, but also has no ; between it and that ==.

You were almost there. Your code should look like this:

split(";(?=[^;]*==)")

Notice that the (?=...) part is a positive look-ahead. It only checks whether the text after ; can be matched by the subexpression [^;]*==, but it doesn't include that text in the final match (the look-ahead itself is zero-length), so it won't disappear after splitting.

DEMO:

String str = "Key1==value1; key2==val;ue2;key3==value3";
for (String s : str.split(";(?=[^;]*==)")){
    System.out.println(s);
}

Output:

Key1==value1
 key2==val;ue2
key3==value3

If you also want to get rid of the space before key2, make it part of the delimiter you split on: let the regex match not only ; but also the whitespace surrounding it. Zero or more whitespace characters can be represented with \s*, so your code can look like this:

split("\\s*;\\s*(?=[^;]*==)")
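Put together as a small self-contained sketch (the class and method names are just placeholders, and the extra trim() on the whole input is my own addition to drop leading or trailing whitespace that the delimiter cannot reach):

```java
import java.util.Arrays;
import java.util.List;

final class FiqlSplitDemo {
    // Split on ';' plus surrounding whitespace, but only when "==" follows
    // before the next ';'.
    static List<String> splitConditions(String query) {
        return Arrays.asList(query.trim().split("\\s*;\\s*(?=[^;]*==)"));
    }

    public static void main(String[] args) {
        String str = "Key1==value1; key2==val;ue2;key3==value3";
        for (String s : splitConditions(str)) {
            System.out.println(s);
        }
        // Prints:
        // Key1==value1
        // key2==val;ue2
        // key3==value3
    }
}
```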

Upvotes: 1
