Stirls

Reputation: 97

Regular expression to process key value pairs

I am attempting to write a regular expression to process a string of key/value pairs formatted like so:

KEY/VALUE KEY/VALUE VALUE KEY/VALUE

A key can have multiple values separated by a space.

I want to match each key's values together, so the result for the string above would be:

VALUE
VALUE VALUE
VALUE

I currently have the following as my regex

[A-Z0-9]+/([A-Z0-9 ]+)(?:(?!^[A-Z0-9]+/))

but this returns

VALUE KEY 

as the first result.

Upvotes: 1

Views: 1063

Answers (3)

Doug Moscrop

Reputation: 4544

Why a RegEx?

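String input = "key/value key/value1 value2 key/value";

// Split only on a space that is followed by a "key/" token, so each
// piece has the form "key/value" or "key/value1 value2".
String[] pairs = input.split(" (?=\\S+/)");

for (String pair : pairs) {
    // Limit of 2: everything after the first slash belongs to the value.
    String[] parts = pair.split("/", 2);
    String key = parts[0];
    String value = parts[1];
    /* (Optionally)
        String[] values = value.split(" ");
    */
}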

However, if you insist, then I think this:

([\w]+)/([\w ]+)(?![\w]*/)

is a good choice. It captures the key in one group and its values in a second group. It also permits underscores in key and value names, and you can add a hyphen to the character sets if you like.
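For illustration, here is a minimal sketch of how that pattern could be applied with java.util.regex. The class name KeyValueGroups and the "key=values" output format are my own choices for the example, not part of the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class KeyValueGroups {
    // The pattern from this answer: group 1 is the key, group 2 its values.
    public static final Pattern PAIR =
            Pattern.compile("([\\w]+)/([\\w ]+)(?![\\w]*/)");

    // Returns one "key=values" string per match, in input order.
    // (The "key=values" format is only for display in this sketch.)
    public static List<String> parse(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = PAIR.matcher(input);
        while (m.find()) {
            result.add(m.group(1) + "=" + m.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [key=value, key=value1 value2, key=value]
        System.out.println(parse("key/value key/value1 value2 key/value"));
    }
}
```

The values in group 2 can then be split on a space if you need them individually.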

Credit: ruakh for doing the bulk of the work on the RegEx. I upvoted his/her answer.

Upvotes: 1

Jay Gilford

Reputation: 15151

You would be better off using a string split of some kind (I'm not a Java programmer, but PHP has the explode() function for this). First split the string on spaces into an array of key/value items, then loop through the array and split each item again, this time on the /.
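A rough Java sketch of that two-step split might look like the following. It assumes that a token without a / is an extra value for the most recent key, which is how the question describes the format. The class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitKeyValues {
    // Returns the values of each key, joined by a space, in input order.
    public static List<String> valuesPerKey(String input) {
        List<String> result = new ArrayList<>();
        // Step 1: split the whole string on whitespace.
        for (String token : input.trim().split("\\s+")) {
            // Step 2: split each token on the first slash only.
            String[] parts = token.split("/", 2);
            if (parts.length == 2) {
                // "KEY/VALUE": start a new entry with the first value.
                result.add(parts[1]);
            } else if (!result.isEmpty()) {
                // Bare "VALUE" (assumption): it belongs to the previous key.
                int last = result.size() - 1;
                result.set(last, result.get(last) + " " + token);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [VALUE, VALUE VALUE, VALUE]
        System.out.println(valuesPerKey("KEY/VALUE KEY/VALUE VALUE KEY/VALUE"));
    }
}
```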

Upvotes: 0

ruakh

Reputation: 183301

In your negative lookahead assertion, change + to *; otherwise, you're not preventing the match from ending right before a /, you're only preventing it from ending right before a word that's followed by a /. Also, remove the ^ from your negative lookahead assertion; it means "beginning of string", so it will never match in this context. That leaves:

[A-Z0-9]+/([A-Z0-9 ]+)(?![A-Z0-9]*/)

(I also dropped the (?:...) notation, since it had no effect in the context where it appeared.)

That said, a somewhat easier-to-read approach might be this:

[A-Z0-9]+/([A-Z0-9 ]+)( |$)

which requires the value to be followed by either a space (which gets swallowed) or end-of-string. Since keys are followed by /, it will ignore them.
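As a quick check, both patterns can be run against the sample string from the question. This is only a sketch. The class name ValueRegexDemo and the helper values are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ValueRegexDemo {
    // Corrected negative-lookahead version.
    public static final Pattern LOOKAHEAD =
            Pattern.compile("[A-Z0-9]+/([A-Z0-9 ]+)(?![A-Z0-9]*/)");
    // Alternative: the value must be followed by a space or end-of-string.
    public static final Pattern SPACE_OR_END =
            Pattern.compile("[A-Z0-9]+/([A-Z0-9 ]+)( |$)");

    // Collects capture group 1 (the values) of every match, in order.
    public static List<String> values(Pattern pattern, String input) {
        List<String> result = new ArrayList<>();
        Matcher m = pattern.matcher(input);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "KEY/VALUE KEY/VALUE VALUE KEY/VALUE";
        // Both lines print [VALUE, VALUE VALUE, VALUE]
        System.out.println(values(LOOKAHEAD, input));
        System.out.println(values(SPACE_OR_END, input));
    }
}
```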

Upvotes: 2
