Vencovsky

Reputation: 31625

Parsing array syntax using regex in javascript

I found an answer for this here, but it's in PHP.

I would like to match an array like [123, "hehe", "lala"] but only if the array syntax is correct.

I made this regex /\["?.+"?(?:,"?.+"?)*\]/.

The problem is that if the input is [123, "hehe, "lala"], the regex matches even though the syntax is incorrect.

How can I make it only match if the array syntax is correct?

My problem is making the second " required when the first " is matched.

Edit: I'm only trying to do this with strings and numbers inside the array.

Upvotes: 0

Views: 767

Answers (2)

Alexis Wilke

Reputation: 20731

You need two (or more) separate alternatives, joined with the | operator, one for each kind of item (quoted string, number).

So it would be something like this:

/\[\s*("[^"]*"|[0-9]+)(\s*,\s*("[^"]*"|[0-9]+))*\s*\]/

(You may also want to use ^ at the start and $ at the end to make sure nothing else appears before/after the array: /^...snip...$/ to match the string from start to finish.)
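As a rough check, here is a minimal sketch of the anchored pattern run through RegExp.prototype.test. The names arrayPattern, samples and results, and the sample inputs, are just for illustration.

```javascript
// Anchored version of the pattern above: a double-quoted string or an
// integer, followed by zero or more ", item" groups, between brackets.
const arrayPattern = /^\[\s*("[^"]*"|[0-9]+)(\s*,\s*("[^"]*"|[0-9]+))*\s*\]$/;

const samples = [
  '[123, "hehe", "lala"]', // well formed
  '[123, "hehe, "lala"]',  // missing closing quote after hehe
  '[123, "hehe", ]',       // trailing comma
  '[]',                    // empty array: the pattern requires at least one item
];

const results = samples.map((s) => arrayPattern.test(s));
console.log(results); // [ true, false, false, false ]
```

Note that the empty array is rejected as written. If it should be accepted, the whole item list would have to be made optional.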

If you need floating-point numbers with exponents, add a period and the 'e' character to the class: [0-9.eE]+ (which is why I did not use \d+, since \d allows only digits). Making sure a number is actually valid is much more complicated, obviously (sign, exponent with or without a sign, digits only before or only after the decimal point, and so on).
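To illustrate the difference, here is a sketch comparing the loose character class with a stricter number pattern. The stricter pattern assumes JSON-style numbers (optional minus sign, no leading zeros, digits required on both sides of the decimal point), which is narrower than what JavaScript itself accepts. All names here are hypothetical.

```javascript
// Loose class from the text: any mix of digits, '.', 'e' and 'E'.
const looseNumber = /^[0-9.eE]+$/;

// Stricter sketch (assumption: JSON-style numbers only):
// optional minus, integer part, optional fraction, optional exponent.
const strictNumber = /^-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?$/;

const numberSamples = ["123", "1.5e10", "-2.5E-3", "1.2.3", "e.e"];

const numberResults = numberSamples.map((s) => ({
  input: s,
  loose: looseNumber.test(s),
  strict: strictNumber.test(s),
}));

// The loose class accepts "1.2.3" and "e.e" and rejects "-2.5E-3";
// the strict pattern does the opposite for those three inputs.
console.log(numberResults);
```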

You could also support single quoted strings. That too is a separate expression: '[^']*'.

You may want to allow spaces before and after the brackets too (start: /^\s*\[... and end: ...\]\s*$/).
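The two extensions above (single-quoted strings and whitespace outside the brackets) could be combined along these lines. This is only a sketch: the pattern is assembled with String.raw so the item expression is written once, and the names item and flexibleArray are made up for the example.

```javascript
// One item: double-quoted string, single-quoted string, or integer.
const item = String.raw`(?:"[^"]*"|'[^']*'|[0-9]+)`;

// Whole input: optional whitespace around the brackets and around commas.
const flexibleArray = new RegExp(
  String.raw`^\s*\[\s*${item}(?:\s*,\s*${item})*\s*\]\s*$`
);

const flexibleSamples = [
  `  [1, 'two', "three"]  `, // mixed quote styles, padded with spaces
  `["it's"]`,                // single quote inside a double-quoted string
  `[1, 'two", 3]`,           // opened with ' but closed with "
];

const flexibleResults = flexibleSamples.map((s) => flexibleArray.test(s));
console.log(flexibleResults); // [ true, true, false ]
```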

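Finally, if you really want to support JavaScript strings, you would need to add support for the backslash. Something like this: ("([^"\\]|\\.)*"). The character class excludes the backslash so that an escaped quote (\") cannot be read as the end of the string.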
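Here is a sketch of an array pattern with backslash escapes, assuming the string alternative excludes both the quote and the backslash from its character class. It handles only the "backslash plus any character" form and does not validate specific escapes such as \u sequences. The names are hypothetical.

```javascript
// String item: any character except " or \, or a backslash followed by
// any one character. Number item: plain digits.
const escItem = String.raw`(?:"(?:[^"\\]|\\.)*"|[0-9]+)`;

const escapedArray = new RegExp(
  String.raw`^\[\s*${escItem}(?:\s*,\s*${escItem})*\s*\]$`
);

// String.raw keeps the backslashes in the samples exactly as typed.
const escSamples = [
  String.raw`["say \"hi\"", 42]`,        // escaped quotes inside a string
  String.raw`["ends with backslash\\"]`, // escaped backslash, then closing quote
  String.raw`["unterminated\"]`,         // the only closing quote is escaped
];

const escResults = escSamples.map((s) => escapedArray.test(s));
console.log(escResults); // [ true, true, false ]
```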

Note

Your .+ expression also matches " and , characters, and without the ^ and $ anchors an input like the following matches your expression just fine:

This Array ["test", 123, true, "this"] Here
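This is easy to reproduce. The sketch below runs the pattern from the question against that sentence and against the malformed input; the variable names are just for illustration.

```javascript
// Pattern from the question: the greedy .+ also consumes quotes and commas.
const originalPattern = /\["?.+"?(?:,"?.+"?)*\]/;

const sentence = 'This Array ["test", 123, true, "this"] Here';
const broken = '[123, "hehe, "lala"]';

// Unanchored, so the bracketed part is found inside the longer sentence.
const matchedInSentence = originalPattern.test(sentence);
const matchedText = sentence.match(originalPattern)[0];

// The unbalanced quote goes unnoticed because .+ swallows it.
const matchedBroken = originalPattern.test(broken);

console.log(matchedInSentence, matchedBroken); // true true
console.log(matchedText); // ["test", 123, true, "this"]
```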

Upvotes: 1

Alex G

Reputation: 1917

You can try this regex: /\[((\d+|"([^"]|\\")*?")\s*,?\s*)*(?<!,)\]/

Each item should be either

"([^"]|\\")*?": start and end with ", containing anything but ". If " is contained it should be escaped (\").

\d+: a number

Each item should be followed by

\s*,?\s*: an optional comma with any number of spaces before or after.

The closing bracket must not be immediately preceded by a comma: (?<!,)

Demo: https://regex101.com/r/jRAQUc/1
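A few inputs that may be worth trying before relying on this pattern, as a sketch with hypothetical names. Because the comma is optional and the lookbehind only inspects the single character before ], some loosely formed inputs are accepted. Lookbehind also needs an ES2018-capable engine.

```javascript
// Pattern from this answer, unchanged.
const lookbehindPattern = /\[((\d+|"([^"]|\\")*?")\s*,?\s*)*(?<!,)\]/;

const lookbehindSamples = [
  '[123, "hehe", "lala"]', // well formed
  '[123, "hehe, "lala"]',  // missing closing quote
  '[1,]',                  // trailing comma directly before ]
  '[1, ]',                 // trailing comma followed by a space
  '[1 2]',                 // items separated by a space only
  '[]',                    // empty array
];

const lookbehindResults = lookbehindSamples.map((s) => lookbehindPattern.test(s));
console.log(lookbehindResults); // [ true, false, false, true, true, true ]
```

If "[1, ]" and "[1 2]" should be rejected, one option is to require the comma between items rather than making it optional after each item.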

Upvotes: 2
