MaYaN

Reputation: 6996

How can I match the given pattern using Regex in C#?

I have the following input:

-key1:"val1" -key2: "val2" -key3:(val3) -key4: "(val4)" -key5: val5 -key6: "val-6" -key-7: val7 -key-eight: "val 8"

With only the following assumption about the pattern:

How can I match and extract each key and its corresponding value?

I have so far come up with the following regex:

-(?<key>\S*):\s?(?<val>\S*)

However, it does not match the complete value of the last argument, because that value contains a space, and I cannot figure out how to match it.

The expected output should be:

Any help is much appreciated.

Upvotes: 2

Views: 84

Answers (5)

Chinh Nguyen

Reputation: 653

This should do the trick

-(?<key>\S*):\s*(?<value>(?(?=")((")(?:(?=(\\?))\2.)*?\1))(\S*))

A sample link can be found here. Basically, it uses an if/then/else construct of the form (?(?=")(true regex)(false regex)) to detect whether the value starts with ". The false regex is your \S*, while the true regex tries to match the opening and closing quotes: (")(?:(?=(\\?))\2.)*?\1.
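To illustrate the "quoted branch, otherwise unquoted branch" idea in C#, here is a minimal sketch. Note that it is not this answer's pattern: it swaps the conditional for plain alternation (a quoted value is tried first, then a run of non-whitespace), and it assumes values contain no escaped quotes. The class name QuotedValueMatcher and the method Parse are hypothetical names chosen for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class QuotedValueMatcher
{
    // Simplified variant (an assumption, not the answer's regex):
    // try a quoted value "[^"]*" first, otherwise fall back to \S+.
    // Escaped quotes inside a value are not handled.
    private static readonly Regex Pattern =
        new Regex(@"-(?<key>[^\s:]+):\s*(?<val>""[^""]*""|\S+)");

    // Hypothetical helper: maps each key to its raw value (quotes kept).
    public static Dictionary<string, string> Parse(string input)
    {
        var result = new Dictionary<string, string>();
        foreach (Match m in Pattern.Matches(input))
        {
            result[m.Groups["key"].Value] = m.Groups["val"].Value;
        }
        return result;
    }

    public static void Main()
    {
        var input = "-key5: val5 -key6: \"val-6\" -key-eight: \"val 8\"";
        foreach (var pair in Parse(input))
            Console.WriteLine($"{pair.Key} -> {pair.Value}");
    }
}
```

Because the quoted alternative comes first, a value such as "val 8" is captured whole, while unquoted values stop at the first whitespace.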

Upvotes: 0

JasperMoneyshot

Reputation: 357

I presume you want to keep the brackets and quotation marks, since that's what the example you gave does? If so, the following should work:

-(?<key>\S+):+\s?(?<val>\S+\s?\d+\)?\"?)

This does presume that all values end with a number, though.

EDIT: Given that the value doesn't always end with a number, but presumably always starts with val, this is what I have:

-(?<key>\S+):+\s?(?<val>\"?\(?(val)+\s?\S+)

Seems to be working properly...

Upvotes: 1

NeverHopeless

Reputation: 11233

Try this regex with the Replace function:

(?:^|(?!\S)\s*)-|\s*:\s*

and replace each match with "\n". You should get the keys and values on separate lines.
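As a rough sketch of how that replacement could be wired up in C#, the snippet below applies Regex.Replace with the pattern above and then splits the result on newlines. The class name ReplaceSplitter and the method Tokenize are hypothetical, and the sketch assumes that keys and values strictly alternate in the output.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ReplaceSplitter
{
    // Replaces each leading "-" (at the start or after whitespace) and each
    // ":" separator with a newline, then splits into alternating tokens:
    // key, value, key, value, ...
    public static string[] Tokenize(string input)
    {
        string replaced = Regex.Replace(input, @"(?:^|(?!\S)\s*)-|\s*:\s*", "\n");
        return replaced.Split(new[] { '\n' }, StringSplitOptions.RemoveEmptyEntries);
    }

    public static void Main()
    {
        var tokens = Tokenize("-key1:\"val1\" -key6: \"val-6\"");
        // Assumes an even number of tokens (every key has a value).
        for (int i = 0; i + 1 < tokens.Length; i += 2)
            Console.WriteLine($"{tokens[i]} -> {tokens[i + 1]}");
    }
}
```

A hyphen inside a key or value (for example val-6) is left alone, because the pattern only matches a hyphen at the start of the input or after whitespace. A colon inside a value would be split, though.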

Upvotes: 1

V0ldek

Reputation: 10563

Assuming you want to allow whitespace inside a value, but not at its beginning or end, change your regex to:

-(?<key>\S*):\s?(?<val>\S+(\s*[^-\s])*)

This assumes that a - character preceded by whitespace always means a new key is beginning; it cannot be part of any value.

For this example:

-key: value -key2: value with whitespace -key3: value-with-hyphens -key4: v

The matches are: -key: value, -key2: value with whitespace, -key3: value-with-hyphens, -key4: v.

It also works perfectly well on your provided example.
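For completeness, here is a minimal sketch of using this pattern from C# via Regex.Matches and the named groups key and val. The class name ArgumentParser and the method Parse are hypothetical; only the pattern itself comes from this answer.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ArgumentParser
{
    // The pattern from this answer; "key" and "val" are its named groups.
    private static readonly Regex Pattern =
        new Regex(@"-(?<key>\S*):\s?(?<val>\S+(\s*[^-\s])*)");

    // Hypothetical helper: returns the key/value pairs in input order.
    public static List<KeyValuePair<string, string>> Parse(string input)
    {
        var pairs = new List<KeyValuePair<string, string>>();
        foreach (Match m in Pattern.Matches(input))
        {
            pairs.Add(new KeyValuePair<string, string>(
                m.Groups["key"].Value, m.Groups["val"].Value));
        }
        return pairs;
    }

    public static void Main()
    {
        var input = "-key1:\"val1\" -key2: \"val2\" -key3:(val3) -key4: \"(val4)\" " +
                    "-key5: val5 -key6: \"val-6\" -key-7: val7 -key-eight: \"val 8\"";
        foreach (var pair in Parse(input))
            Console.WriteLine($"{pair.Key} -> {pair.Value}");
    }
}
```

The values keep their surrounding quotes and brackets; strip them afterwards (for example with Trim('"')) if you only need the inner text.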

Upvotes: 4

TheGeneral

Reputation: 81493

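A low-tech (non-regex) solution, just as an alternative. Trim the leftover whitespace and call ToDictionary if you need to.

// Split on " -" to separate the arguments, then split each one on the first ':' only,
// so that a colon inside a value does not produce extra parts.
var results = input.Split(new[] { " -" }, StringSplitOptions.RemoveEmptyEntries)
                   .Select(x => x.Trim('-').Split(new[] { ':' }, 2));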

Full Demo Here

Output

key1 -> "val1"
key2 ->  "val2"
key3 -> (val3)
key4 ->  "(val4)"
key5 ->  val5
key6 ->  "val-6"
key-7 ->  val7
key8 ->  "val 8"
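If a dictionary is more convenient, the split-based approach could be extended roughly as follows. This is a sketch under a few assumptions: every argument contains a colon, keys are unique, and the sequence " -" never appears inside a value. The class name SplitParser and the method Parse are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SplitParser
{
    // Splits the input into arguments on " -", splits each argument on its
    // first ':' only, and trims the whitespace left around the value.
    // Throws if an argument has no ':' or if a key occurs twice.
    public static Dictionary<string, string> Parse(string input)
    {
        return input
            .Split(new[] { " -" }, StringSplitOptions.RemoveEmptyEntries)
            .Select(x => x.Trim('-').Split(new[] { ':' }, 2))
            .ToDictionary(parts => parts[0], parts => parts[1].Trim());
    }

    public static void Main()
    {
        var args = Parse("-key1:\"val1\" -key-7: val7 -key-eight: \"val 8\"");
        foreach (var pair in args)
            Console.WriteLine($"{pair.Key} -> {pair.Value}");
    }
}
```

Trimming the value removes the leading space that shows up in the output above for arguments written as "key: value".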

Upvotes: 1
