noi.m

Reputation: 3132

regular expression to tokenize string

I have a serialized object that looks like this (not including inverted commas):

'key1:value1,key2:value2,key3:value3'

It could also look like this:

'key1:value1,key3:value3'

OR

'key1:value1'

OR

'' (it could be empty)

At this point I have tokenizing logic that breaks up this string, but it is a tad verbose. Is there a single regular expression that can extract the value for a given key (or return null if the key is absent) from any of the above strings?

Upvotes: 1

Views: 105

Answers (3)

Wiktor Stribiżew

Reputation: 627022

Keyword matching is straightforward if you know the exact boundaries. In this case, you have single apostrophes as string boundaries and a comma as a separator. So, this is the regex to match a value for a given key (based on your input examples; the match counts below are totals across your sample strings):

(?<=key1\:).+?(?=,|'|$) --> finds 3 "value1" matches
(?<=key2\:).+?(?=,|'|$) --> finds 1 "value2" match
(?<=key3\:).+?(?=,|'|$) --> finds 2 "value3" matches
(?<=key4\:).+?(?=,|'|$) --> no match
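Since the question is not tagged with a language, here is a minimal sketch of the same lookbehind/lookahead idea in Python, purely as an illustration. The function name `get_value` and the use of `re.escape` are my own additions, not part of the pattern above.

```python
import re

def get_value(serialized, key):
    """Return the value stored under `key`, or None if the key is absent."""
    # re.escape guards against keys that contain regex metacharacters.
    # The lookbehind requires "<key>:" right before the match; the lazy
    # .+? stops at the first comma, apostrophe, or end of string.
    pattern = rf"(?<={re.escape(key)}:).+?(?=,|'|$)"
    match = re.search(pattern, serialized)
    return match.group(0) if match else None

full = "key1:value1,key2:value2,key3:value3"
print(get_value(full, "key2"))
print(get_value("key1:value1", "key4"))
print(get_value("", "key1"))
```

One caveat: the lookbehind does not check where the key starts, so a lookup for `key1` would also match inside a longer key such as `mykey1`. If your keys can overlap like that, you would need an extra boundary check.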

Upvotes: 0

Todd A. Jacobs

Reputation: 84393

Use Ruby String#split

Regular expression engines vary a lot by language, and since you didn't tag your question with one, I'm giving you a simple Ruby solution. The following will split your string on either a colon or a comma:

'key1:value1,key2:value2,key3:value3'.split(/:|,/)
#=> ["key1", "value1", "key2", "value2", "key3", "value3"]
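To get from the flat token list to a lookup by key, you can pair up alternating tokens. Here is a rough sketch of that step, written in Python rather than Ruby; `parse_pairs` is a hypothetical helper name, and it assumes the input is well formed (every key has exactly one value).

```python
import re

def parse_pairs(serialized):
    """Split on ':' or ',' and pair alternating tokens into a dict."""
    # re.split on an empty string yields [''], so handle that case explicitly.
    tokens = re.split(r"[:,]", serialized) if serialized else []
    # Even positions hold keys, odd positions hold values.
    return dict(zip(tokens[0::2], tokens[1::2]))

pairs = parse_pairs("key1:value1,key3:value3")
print(pairs.get("key3"))
print(pairs.get("key2"))
```

Using `dict.get` gives you the "value or null" behaviour asked for, since it returns `None` for a missing key.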

Upvotes: 0

Lucas Trzesniewski

Reputation: 51380

I guess all you need is to find key/value pairs:

The simplest regex you can use is:

([^:,]+):([^:,]+)

This will match a key in $1 and a value in $2. Simple enough.
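As an illustration only (the question does not name a language), this is roughly how the two capture groups could be collected into a lookup table in Python. `PAIR` and `to_dict` are names I made up for the sketch.

```python
import re

# Group 1 is the key, group 2 is the value.
PAIR = re.compile(r"([^:,]+):([^:,]+)")

def to_dict(serialized):
    """Collect every key:value match into a dict."""
    # With two groups in the pattern, findall returns (key, value) tuples.
    return dict(PAIR.findall(serialized))

table = to_dict("key1:value1,key2:value2,key3:value3")
print(table.get("key2"))
print(to_dict("").get("key1"))
```

An empty input simply produces an empty dict, so a lookup on it falls back to `None`.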

Now you could introduce variations if you want to:

(\w+):(.+?)(?=,|$)

This one ensures the key only contains alphanumeric characters and underscores, and makes sure the value either ends with a comma or at the end of the string. Hopefully you get the point.
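To make the difference between the two patterns concrete, here is a small Python comparison. The sample string, with a colon inside a value, is a hypothetical input that does not appear in the question; `SIMPLE` and `STRICT` are just labels for the sketch.

```python
import re

SIMPLE = re.compile(r"([^:,]+):([^:,]+)")
STRICT = re.compile(r"(\w+):(.+?)(?=,|$)")

# Hypothetical input: the first value itself contains a colon.
sample = "time:12:30,key2:v"

# The first pattern excludes ':' from values, so the value is cut at the second colon.
simple_pairs = SIMPLE.findall(sample)

# The second pattern lets the value run up to the next comma or the end of the string.
strict_pairs = STRICT.findall(sample)

print(simple_pairs)
print(strict_pairs)
```

If your values can never contain a colon, both patterns should behave the same on your data and the simpler one is enough.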

Upvotes: 0
