andreyro

Reputation: 985

Regex to match key value pairs from JSON

I need to match all the key-value pairs in a complex JSON document, but only those whose values are text (strings).

For example from:

{"results":[
{"id":16,"name":"some name1","location":"some location1","parent":true,"region":"some region"},
{"id":157,"name":"some name2","location":"some location2","parent":true}
],"totalCount":170}

I need to match:

"name"
"some name1"
"location"
"some location1"
"region"
"some region"
etc.

I have this regex: [^:]+\"(?=[,}\s]|$) but it only matches the values (it does match those correctly).

I also need to match the keys: "name", "location", "region" (and there can be other key names).

Here is an example showing the values it matches: https://regex101.com/r/m8FePZ/6

Upvotes: 0

Views: 1885

Answers (2)

Raymond Choi

Reputation: 1271

I see that you ended up using a JSON library instead of regex. Here is a simple solution using "Josson & Jossons".

https://github.com/octomix/josson

implementation 'com.octomix.josson:josson:1.3.22'

---------------------------------------------

Josson josson = Josson.fromJsonString(
    "{\n" +
    "    \"results\": [\n" +
    "        {\n" +
    "            \"id\": 16,\n" +
    "            \"name\": \"some name1\",\n" +
    "            \"location\": \"some location1\",\n" +
    "            \"parent\": true,\n" +
    "            \"region\": \"some region\"\n" +
    "        },\n" +
    "        {\n" +
    "            \"id\": 157,\n" +
    "            \"name\": \"some name2\",\n" +
    "            \"location\": \"some location2\",\n" +
    "            \"parent\": true\n" +
    "        }\n" +
    "    ],\n" +
    "    \"totalCount\": 170\n" +
    "}");
JsonNode node = josson.getNode(
        "results.entries().[value.isText()]*.toArray()");
System.out.println(node.toPrettyString());

Output

[ "name", "some name1", "location", "some location1", "region", "some region", "name", "some name2", "location", "some location2" ]

Upvotes: 0

Peter Thoeny

Reputation: 7616

As others pointed out, if you want a robust solution, use a JSON parser in your language.

If you want to use regex, and the engine supports lookbehind, you can use this:

/("[^"]*"(?=:")|(?<=":)"[^"]*")/g

Explanation:

  • | - or combination of:
    • "[^"]*"(?=:") - quote, 0+ non-quotes, quote, followed by positive lookahead for colon and quote
    • (?<=":)"[^"]*" - positive lookbehind for quote and colon, followed by quote, 0+ non-quotes, quote
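In Java (the language used in the other answer), this regex can be applied with java.util.regex, which supports the fixed-width lookbehind used here. The following is only a minimal sketch: the class name QuotedPairMatcher and the method extract are made up for illustration, and the pattern is the one above with the quotes escaped for a Java string literal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
final class QuotedPairMatcher {
    // A quoted token followed by :"   OR   a quoted token preceded by ":
    private static final Pattern PAIRS =
            Pattern.compile("\"[^\"]*\"(?=:\")|(?<=\":)\"[^\"]*\"");

    // Collects every match, quotes included, in document order.
    static List<String> extract(String json) {
        List<String> matches = new ArrayList<>();
        Matcher m = PAIRS.matcher(json);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Compact JSON only: no whitespace around the colons.
        String json = "{\"results\":["
                + "{\"id\":16,\"name\":\"some name1\",\"location\":\"some location1\","
                + "\"parent\":true,\"region\":\"some region\"},"
                + "{\"id\":157,\"name\":\"some name2\",\"location\":\"some location2\","
                + "\"parent\":true}],\"totalCount\":170}";
        extract(json).forEach(System.out::println);
    }
}
```

For this compact input the keys "id", "parent", "results" and "totalCount" are skipped, because their values are not quoted strings, and only the name, location and region pairs are printed.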

If you want to exclude the quotes in the matches, use this regex:

/(?<=")([^"]*(?=":")|(?<=":")[^"]*)/g
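The quote-free variant can be tried the same way in Java. Again a sketch under the same assumptions: UnquotedPairMatcher and extract are illustrative names only, and the input is assumed to be compact JSON without whitespace around the colons.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
final class UnquotedPairMatcher {
    // After an opening quote: text followed by ":"   OR   text preceded by ":"
    private static final Pattern PAIRS =
            Pattern.compile("(?<=\")([^\"]*(?=\":\")|(?<=\":\")[^\"]*)");

    // Collects every match, without the surrounding quotes.
    static List<String> extract(String json) {
        List<String> matches = new ArrayList<>();
        Matcher m = PAIRS.matcher(json);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String json = "{\"id\":16,\"name\":\"some name1\",\"location\":\"some location1\"}";
        System.out.println(extract(json));
        // prints [name, some name1, location, some location1]
    }
}
```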

Note that these regexes fail on corner cases, such as whitespace around keys and values, escaped quotes in values, etc. Hence it is safer to use an actual JSON parser.
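To make two of those corner cases concrete, here is a small Java sketch that runs the quote-including regex against pretty-printed input and against a value containing an escaped quote. The class name RegexCornerCases is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of any library.
final class RegexCornerCases {
    // The quote-including regex from this answer, escaped for a Java string.
    private static final Pattern PAIRS =
            Pattern.compile("\"[^\"]*\"(?=:\")|(?<=\":)\"[^\"]*\"");

    static List<String> extract(String json) {
        List<String> matches = new ArrayList<>();
        Matcher m = PAIRS.matcher(json);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // A space after the colon: neither the lookahead nor the lookbehind applies.
        System.out.println(extract("{\"name\": \"some name1\"}"));
        // prints []

        // An escaped quote inside the value: the match stops at the first inner quote.
        System.out.println(extract("{\"name\":\"say \\\"hi\\\"\"}"));
        // prints ["name", "say \"]
    }
}
```

The first input silently yields nothing and the second yields a truncated value, so a failure here is easy to miss unless the output is checked.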

Upvotes: 1
