datacruncher

Reputation: 165

Get the key values using jq from json

I am looking for a way to find the full key path for a given value, which is held in a shell variable. My input is the result of an Elasticsearch query.

For example, I want the full path to the key whose value is 9i6O4ERWWB. The value is always unique; only the example.com and template1 keys change (I cannot predict their names).

Once I know the key path (_source.statistics.example.com.template1), I want to increment the "counter" field and update the Elasticsearch document.

My input JSON:

{
    "_index": "domains",
    "_type": "doc",
    "_id": "c66443eb1e6a0850b03a91fdb967f4d1",
    "_score": 2.4877305,
    "_source": {
        "user_id": "c66443eb1e6a0850b03a91fdb967f4d1",
        "statistics": {
            "test_count": 0,
            "datasize": 0,
            "example.com": {
                "template1": {
                    "image_id": "iPpDWbaO3YTIEb0pBkW3.png",
                    "link_id": "4ybOOUJpaBpDaLxPkz1j.html",
                    "counter": 0,
                    "subdomain_id": "9i6O4ERWWB"
                },
                "template2": {
                    "image_id": "iPpDWasdas322sdaW3.png",
                    "link_id": "4ybOOd3425sdfsz1j.html",
                    "counter": 1,
                    "subdomain_id": "432432sdxWWB"
                }
            },
            "example1.com": {
                "template1": {
                    "image_id": "iPpDWdasdasdasdas3.png",
                    "link_id": "4ybOOUadsasdadsasd1j.html",
                    "subdomain_id": "9i6O4ERWWB"
                }
            }
        }
    }
}

What I have tried:

<myfile jq -c 'paths | select(.[-1])'
<myfile jq -c 'paths | select(.[-1] == "subdomain_id")'

but this prints every key path and none of the values:

["_index"]
["_type"]
["_id"]
["_score"]
["_source"]
["_source","user_id"]
["_source","statistics"]
["_source","statistics","test_count"]
["_source","statistics","datasize"]
["_source","statistics","example.com"]
["_source","statistics","example.com","template1"]
["_source","statistics","example.com","template1","image_id"]
["_source","statistics","example.com","template1","link_id"]
["_source","statistics","example.com","template1","subdomain_id"]
["_source","statistics","template2"]
["_source","statistics","template2","image_id"]
["_source","statistics","template2","link_id"]
["_source","statistics","template2","subdomain_id"]
["_source","statistics","example1.com"]
["_source","statistics","example1.com","template1"]
["_source","statistics","example1.com","template1","image_id"]
["_source","statistics","example1.com","template1","link_id"]
["_source","statistics","example1.com","template1","subdomain_id"]

The pseudocode I am trying to write:

seeked_key_value="432432sdxWWB"
jq -n --arg seeked_key_value "$seeked_key_value" \
'paths | select(.[-1].$seeked_key_value'

Expected result: ["_source","statistics","example.com","template1","subdomain_id":"432432sdxWWB"]

Is this doable with jq in bash?

Upvotes: 2

Views: 397

Answers (2)

peak

Reputation: 116870

It's best to avoid grep in cases like this. To meet the exact requirements in the present case, one could write:

jq -c 'paths(scalars) as $p
| [$p, getpath($p)]
| select(.[1] == "9i6O4ERWWB")' input.json
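
Since the question keeps the value in a shell variable, the same filter can probably be parameterised with jq's `--arg` option. The following is a minimal sketch, not a definitive version: `sample.json` is a trimmed-down copy of the question's document and `wanted` is a variable name made up for illustration.

```shell
# Trimmed-down copy of the question's document, used only for this sketch.
cat > sample.json <<'EOF'
{"_source":{"statistics":{"test_count":0,
  "example.com":{
    "template1":{"counter":0,"subdomain_id":"9i6O4ERWWB"},
    "template2":{"counter":1,"subdomain_id":"432432sdxWWB"}}}}}
EOF

wanted="432432sdxWWB"   # hypothetical variable holding the value to look up

# --arg exposes the shell variable inside the jq program as the string $wanted.
result=$(jq -c --arg wanted "$wanted" '
  paths(scalars) as $p
  | select(getpath($p) == $wanted)
  | [$p, getpath($p)]' sample.json)

echo "$result"
```

Passing the value with `--arg` avoids splicing shell text into the jq program, so quoting problems with unusual values should not arise.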

If grep-like functionality is really needed, jq's test/1 can be used instead.
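
A rough sketch of what that could look like, assuming a regex match on string leaves is what is wanted. `sample.json` is a reduced copy of the question's data, and the pattern `^9i6O` is only an illustration:

```shell
# Reduced copy of the question's document, used only for this sketch.
cat > sample.json <<'EOF'
{"_source":{"statistics":{"test_count":0,
  "example.com":{
    "template1":{"counter":0,"subdomain_id":"9i6O4ERWWB"},
    "template2":{"counter":1,"subdomain_id":"432432sdxWWB"}}}}}
EOF

# paths(strings) restricts the search to string leaves, because test/1
# raises an error when it is given a number or another non-string value.
matches=$(jq -c '
  paths(strings) as $p
  | select(getpath($p) | test("^9i6O"))
  | $p' sample.json)

echo "$matches"
```

Each matching leaf produces one line of output, so a value that occurs more than once in the document yields several paths.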

Upvotes: 1

Maciej

Reputation: 1994

You can 'extract' paths using the following:

jq -c 'paths(scalars) as $p | [$p, getpath($p)]' file.json | grep 432432sdxWWB

and the response is:

[["_source","statistics","example.com","template2","subdomain_id"],"432432sdxWWB"]

You can probably improve the jq query to return only a single value, but I hope this helps you work out the final version :)
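
One possible refinement, offered as a sketch under assumptions rather than a finished solution: do the matching inside jq, print only the path, and then use the parent of that path to bump "counter" as the question asks. `sample.json` and the variable names are made up for illustration, and sending the updated document back to Elasticsearch is left out.

```shell
# Reduced copy of the question's document, used only for this sketch.
cat > sample.json <<'EOF'
{"_source":{"statistics":{"test_count":0,
  "example.com":{
    "template1":{"counter":0,"subdomain_id":"9i6O4ERWWB"},
    "template2":{"counter":1,"subdomain_id":"432432sdxWWB"}}}}}
EOF

wanted="432432sdxWWB"   # hypothetical variable holding the value to look up

# Step 1: print only the path of the matching leaf, no grep involved.
path_only=$(jq -c --arg wanted "$wanted" '
  paths(scalars) as $p | select(getpath($p) == $wanted) | $p' sample.json)
echo "$path_only"

# Step 2: take the first match, drop its last key to get the enclosing
# template object, and add 1 to "counter" there (a missing counter counts as 0).
updated=$(jq -c --arg wanted "$wanted" '
  first(paths(scalars) as $p | select(getpath($p) == $wanted) | $p[:-1]) as $parent
  | setpath($parent + ["counter"]; (getpath($parent + ["counter"]) // 0) + 1)
' sample.json)
echo "$updated"
```

If no leaf matches, step 2 prints nothing, which a calling script can check before updating anything.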

Upvotes: 0
