Michael Osofsky

Reputation: 13205

How to recurse with jq on nested JSON where each object has a name property?

I have a nested JSON object where each level has the same property key and what distinguishes each level is a property called name. If I want to traverse down to a level which has a particular "path" of name properties, how would I formulate the jq filter?

Here is some sample JSON data that represents a file system's directory structure:

{
  "subs": [
    {
      "name": "aaa",
      "subs": [
        {
          "name": "bbb",
          "subs": [
            {
              "name": "ccc",
              "subs": [
                {
                  "name": "ddd",
                  "payload": "xyz"
                }
              ]
            }
          ]
        }
      ]
    }
  ]
}

What's a jq filter for obtaining the value of the payload in the "path" aaa/bbb/ccc/ddd?

Prior research:

  1. jq - select objects with given key name - helpful, but it looks for any element in the JSON that contains the specified name, whereas I'm looking for an element that's nested under a set of objects that also have specific names.

  2. http://arjanvandergaag.nl/blog/wrestling-json-with-jq.html - helpful in section 4, where it shows how to extract an object that has a property name with a particular value. However, the recursion performed is based on a specific, known set of property names ("values[].links.clone[]"). In my case, my equivalent is just "subs[].subs[].subs[]".

Upvotes: 2

Views: 4055

Answers (2)

peak

Reputation: 117067

Here is the basis for a generic solution:

def descend(name): .subs[] | select(.name == name);

So your particular query could be formulated as follows:

descend( "aaa") | descend( "bbb") | descend( "ccc") | descend( "ddd") | .payload

Or slightly better, still using the above definition of descend:

def path(array): 
  if (array|length)==0 then . 
  else descend(array[0]) | path(array[1:])
  end;

path( ["aaa", "bbb", "ccc", "ddd"] ) | .payload
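As a rough sketch of how the list-based version might be driven from the command line, the snippet below passes the path as one slash-separated string via --arg and splits it inside the filter. The names by_names and $p are illustrative and not part of the answer above; the sample data from the question is inlined.

```shell
# Sample data from the question, inlined so the snippet is self-contained.
json='{"subs":[{"name":"aaa","subs":[{"name":"bbb","subs":[{"name":"ccc","subs":[{"name":"ddd","payload":"xyz"}]}]}]}]}'

# by_names and $p are hypothetical names chosen for this sketch.
result=$(printf '%s' "$json" | jq -r --arg p 'aaa/bbb/ccc/ddd' '
  # Step into the child whose name matches.
  def descend($name): .subs[] | select(.name == $name);
  # Follow the names one by one until the list is exhausted.
  def by_names($names):
    if ($names | length) == 0 then .
    else descend($names[0]) | by_names($names[1:])
    end;
  by_names($p | split("/")) | .payload
')
echo "$result"
```

With the sample data this should print xyz. Using a different name than path also leaves jq's built-in path/1 untouched.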

TCO

The above recursive definition of path/1 is simple enough but would be unsuitable for very deeply nested data structures, e.g. if the depth is greater than 1000. Here is an alternative definition that takes advantage of jq's tail-call optimization, and that therefore runs very quickly:

def atpath(array):
  [array, .] 
  |  until( .[0] == []; .[0] as $a | .[1] | descend($a[0]) | [$a[1:], . ] )
  | .[1];
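The answer does not show a call to atpath, so here is a minimal sketch of one, assuming the sample data from the question (inlined below) and the definitions of descend and atpath exactly as given above:

```shell
# Sample data from the question, inlined so the snippet is self-contained.
json='{"subs":[{"name":"aaa","subs":[{"name":"bbb","subs":[{"name":"ccc","subs":[{"name":"ddd","payload":"xyz"}]}]}]}]}'

result=$(printf '%s' "$json" | jq -r '
  # descend and atpath are copied from the answer.
  def descend(name): .subs[] | select(.name == name);
  def atpath(array):
    [array, .]
    | until( .[0] == []; .[0] as $a | .[1] | descend($a[0]) | [$a[1:], .] )
    | .[1];
  # Walk the list of names, then read the payload of the final object.
  atpath(["aaa", "bbb", "ccc", "ddd"]) | .payload
')
echo "$result"
```

The state carried through until is a two-element array: the names still to be matched, and the object reached so far.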

.aaa.bbb.ccc.ddd

If you want to be able to use the .aaa.bbb.ccc.ddd notation, one approach would be to begin by "flattening" the data:

def flat:
  { (.name): (if .subs then (.subs[] | flat) else .payload end) };

Since the top-level element does not have a "name" tag, the query would then be:

.subs[] | flat | .aaa.bbb.ccc.ddd
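To make the intermediate shape visible, this sketch prints what flat produces for the sample data before the .aaa.bbb.ccc.ddd lookup is applied. The data is inlined from the question, and flat is copied from above.

```shell
# Sample data from the question, inlined so the snippet is self-contained.
json='{"subs":[{"name":"aaa","subs":[{"name":"bbb","subs":[{"name":"ccc","subs":[{"name":"ddd","payload":"xyz"}]}]}]}]}'

# -c prints the flattened object on a single line.
flattened=$(printf '%s' "$json" | jq -c '
  def flat:
    { (.name): (if .subs then (.subs[] | flat) else .payload end) };
  .subs[] | flat
')
echo "$flattened"
```

For this input the result is {"aaa":{"bbb":{"ccc":{"ddd":"xyz"}}}}. Note that if a directory had several entries in subs, flat would emit one object per entry rather than a single merged object, because the generator inside the object construction yields one result per child.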

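Here is a more efficient approach, once again using descend defined above. It relies on jq's built-in path/1, so the user-defined path/1 from earlier in this answer must not be in scope, since it would shadow the built-in: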

def payload(p):
  def get($array):
    if $array == []
    then .payload
    else descend($array[0]) | get($array[1:]) end;
  get( null | path(p) );

payload( .aaa.bbb.ccc.ddd )
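To see what the inner get helper receives, it may help to run the built-in path/1 by itself. This small sketch uses -n so that the input is null, mirroring the null | path(p) step above:

```shell
# -n: use null as the input; -c: compact one-line output.
names=$(jq -nc 'path(.aaa.bbb.ccc.ddd)')
echo "$names"
```

This should print ["aaa","bbb","ccc","ddd"], which is the same list of names that was written out by hand in the earlier versions.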

Upvotes: 3

Michael Osofsky

Reputation: 13205

The filter in the following jq command recurses down a "path" of objects that have name properties which correspond to the "path" aaa/bbb/ccc/ddd:

jq '.subs[] | select(.name == "aaa") | .subs[] | select(.name == "bbb") | .subs[] | select(.name == "ccc") | .subs[] | select(.name == "ddd") | .payload'

Here it is live on jqplay.org:

https://jqplay.org/s/tblW7UX0Si

Upvotes: 2
