user3599444

Reputation: 145

Regex to match keys in JSON

I am trying to match keys in JSON of this type:

define({
  key1: "some text: and more",
  key2 : 'some text ',
  key3: ": more some text",
  key4: 'some text:'
});

with this regexp: (?<=\s|{|,)\s*(\w+)\s*:\s?[\"|\']/g. However, it currently also matches the last text: (inside the value of key4), which should be ignored.

An example can be seen here.

Could you give me a hint on how to fix this regex so that it matches only the keys?
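
For reference, the behavior described above can be reproduced with a small sketch. It uses Python's `re` module purely for illustration (the question does not say which engine is used), the `{` is escaped as a precaution, and the names `SAMPLE` and `ORIGINAL` are made up for this example.

```python
import re

# The input from the question, copied verbatim.
SAMPLE = """define({
  key1: "some text: and more",
  key2 : 'some text ',
  key3: ": more some text",
  key4: 'some text:'
});"""

# The pattern from the question, transcribed for Python (brace escaped).
ORIGINAL = re.compile(r"""(?<=\s|\{|,)\s*(\w+)\s*:\s?["|']""")

# findall returns the contents of the single capture group for each match.
matches = ORIGINAL.findall(SAMPLE)
print(matches)  # ['key1', 'key2', 'key3', 'key4', 'text']
```

The extra `text` comes from the value of key4: there the word is preceded by a space and followed by `:'`, which is exactly the shape the pattern accepts as a key.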

Upvotes: 7

Views: 21195

Answers (3)

Stephan

Reputation: 43013

In your pattern, text is matched because it is also treated as a key. Try this regular expression instead:

(\w+)\s*:\s*(["']).+\2,?

Demo
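
A minimal sketch of how this pattern behaves on the input from the question, assuming Python's `re` module (the names `SAMPLE` and `KEY_VALUE` are only illustrative). Because the pattern consumes the whole quoted value, the `text:` inside key4 is never examined as a candidate key.

```python
import re

# The input from the question, copied verbatim.
SAMPLE = """define({
  key1: "some text: and more",
  key2 : 'some text ',
  key3: ": more some text",
  key4: 'some text:'
});"""

# Group 1 is the key, group 2 the opening quote; \2 requires the same quote to close.
KEY_VALUE = re.compile(r"""(\w+)\s*:\s*(["']).+\2,?""")

keys = [m.group(1) for m in KEY_VALUE.finditer(SAMPLE)]
print(keys)  # ['key1', 'key2', 'key3', 'key4']

# Caveat: .+ is greedy, so several pairs on one line collapse into a single match.
one_line = [m.group(1) for m in KEY_VALUE.finditer('a: "x", b: "y"')]
print(one_line)  # ['a']
```

The second call shows a limitation worth keeping in mind: this works when each key/value pair sits on its own line, as in the question, but not for compact single-line input.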

Upvotes: 2

Mario

Reputation: 36487

I wouldn't suggest parsing JSON using regular expressions. There are small libraries for that, some even header-only and with very convenient licensing terms (like rapidjson, which I'm using right now).

But if you really want to, the following expression should find your key/value pairs (note that I'm using Perl, mostly for nice syntax highlighting):

(\w+)\s*:\s*('[^']*'|"[^"]*"|[+\-]?\d+(?:\.\d+)?)
  • Keep in mind that this won't work properly with escaped quotes inside your values or with strings that are not properly closed.
  • (\w+) will match the full key.
  • \s* matches zero or more whitespace characters.
  • : is really just a direct match.
  • '[^']*' will match any characters enclosed by ' (the second alternative does the same for ").
  • [+\-]?\d+(?:\.\d+)? will match any number (with or without decimals).
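
The pieces listed above can be tried out with a short sketch. It assumes Python's `re` module rather than Perl, writes the decimal point escaped as `\.`, and uses the illustrative names `SAMPLE` and `PAIR`.

```python
import re

# The input from the question, copied verbatim.
SAMPLE = """define({
  key1: "some text: and more",
  key2 : 'some text ',
  key3: ": more some text",
  key4: 'some text:'
});"""

# Group 1 is the key; group 2 is a quoted string or a number.
PAIR = re.compile(r"""(\w+)\s*:\s*('[^']*'|"[^"]*"|[+\-]?\d+(?:\.\d+)?)""")

# With two groups, findall returns (key, value) tuples; values keep their quotes.
pairs = PAIR.findall(SAMPLE)
keys = [key for key, _value in pairs]
print(keys)  # ['key1', 'key2', 'key3', 'key4']

# The numeric alternative accepts an optional sign and decimal part.
number_pair = PAIR.findall("count: -3.5")
print(number_pair)  # [('count', '-3.5')]
```

Since each value is matched in full, a colon inside a string (as in key1 or key4) cannot start a new key match.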

Edit: Since others provided nice and easy to see online demos, here's mine.

Upvotes: 3

zx81

Reputation: 41838

How about this shorter regex:

(?m)^[ ]*([^\r\n:]+?)\s*:

In the demo, look at the Group 1 captures in the right pane.

  • (?m) allows the ^ to match at the beginning of each line
  • ^ asserts that we are positioned at the beginning of the line
  • [ ]* eats up all the space characters
  • ([^\r\n:]+?) lazily matches characters that are not colons : or newlines, and captures them to Group 1 (this is what we want), up to...
  • \s*: matches optional whitespace characters and a colon
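
As a rough check of the steps above, here is a sketch using Python's `re` module (an assumption, the demo may use a different engine; `SAMPLE` and `KEY_PER_LINE` are illustrative names).

```python
import re

# The input from the question, copied verbatim.
SAMPLE = """define({
  key1: "some text: and more",
  key2 : 'some text ',
  key3: ": more some text",
  key4: 'some text:'
});"""

# (?m) makes ^ match at the start of every line; group 1 is the key.
KEY_PER_LINE = re.compile(r"(?m)^[ ]*([^\r\n:]+?)\s*:")

keys = KEY_PER_LINE.findall(SAMPLE)
print(keys)  # ['key1', 'key2', 'key3', 'key4']

# Caveat: only the first key on each line is found, since matches are anchored at ^.
single_line = KEY_PER_LINE.findall("a: 1, b: 2")
print(single_line)  # ['a']
```

Anchoring at the start of the line is what keeps colons inside the values from being picked up, at the cost of requiring one key per line.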

Upvotes: 10
