chrisst

Reputation: 1766

How to use jq to find all paths to a certain key

In a very large nested JSON structure, I'm trying to find all of the paths that end in a given key.

For example, this input:

{
  "A": {
    "A1": {
      "foo": {
        "_": "_"
      }
    },
    "A2": {
      "_": "_"
    }
  },
  "B": {
    "B1": {}
  },
  "foo": {
    "_": "_"
  }
}

would print something along the lines of: ["A","A1","foo"], ["foo"]


Unfortunately, I don't know at what level of nesting the key will appear, so I haven't been able to figure it out with a simple select. I've gotten close with jq '[paths] | .[] | select(contains(["foo"]))', but the output includes every path that passes through foo, not only the paths that end at it:

["A", "A1", "foo"]
["A", "A1", "foo", "_"]
["foo"]
["foo", "_"]

Bonus points if I could keep the original data structure format but simply filter out all paths that don't contain the key (in this case the sub trees under "foo" wouldn't need to be hidden).

Upvotes: 26

Views: 21628

Answers (2)

Oliver I

Reputation: 456

I had the same fundamental problem.

With (yaml) input like:

developer:
  android:
    members:
    - alice
    - bob
    oncall:
    - bob
hr:
  members:
  - charlie
  - doug
this:
  is:
    really:
      deep:
        nesting:
          members:
          - example deep nesting

I wanted to find all arbitrarily nested groups and get their members.

Using this:

yq . | # convert yaml to json using python-yq
    jq ' 
    . as $input | # Save the input for later
    paths | # Get the list of paths
        select(.[-1] | tostring | test("^(members|oncall|priv)$"; "ix")) | # Keep only paths that end with members, oncall, or priv
        . as $path | # save each path in the $path variable
    ( $input | getpath($path) ) as $members | # Get the value of each path from the original input
    {
        "key": ( $path | join("-") ), # The key is the join of all path keys
        "value": $members  # The value is the list of members
    }
    ' |
    jq -s 'from_entries' | # collect kv pairs into a full object using slurp
    yq --sort-keys -y . # Convert back to yaml using python-yq

I get output like this:

developer-android-members:
  - alice
  - bob
developer-android-oncall:
  - bob
hr-members:
  - charlie
  - doug
this-is-really-deep-nesting-members:
  - example deep nesting
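
For comparison, the same collection step can probably be done in a single jq invocation by gathering the entries into an array before calling from_entries. This is only a sketch of that idea, not the pipeline above: it feeds jq a trimmed JSON version of the input directly (so the yq conversion is left out), and the variable names `groups` and `flat` are made up for illustration.

```shell
# Trimmed JSON equivalent of the YAML input above (illustrative data only).
groups='{"developer":{"android":{"members":["alice","bob"],"oncall":["bob"]}},"hr":{"members":["charlie","doug"]}}'

# Collect one {key, value} entry per matching path, then build the object
# with from_entries inside the same jq program (no second jq -s needed).
flat=$(printf '%s' "$groups" | jq -c '
  . as $input
  | [ paths
      | select(.[-1] | tostring | test("^(members|oncall|priv)$"; "ix"))
      | { key: join("-"), value: (. as $path | $input | getpath($path)) }
    ]
  | from_entries
')

printf '%s\n' "$flat"
```

Wrapping the entries in [ ... ] is what removes the need for slurp mode; the matching logic is unchanged.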

Upvotes: 2

peak

Reputation: 116640

With your input:

$ jq -c 'paths | select(.[-1] == "foo")' 
["A","A1","foo"]
["foo"]
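
To see why testing the last path component works where the contains attempt from the question over-matches, the two filters can be run side by side. This is a minimal sketch that inlines the question's sample input; the shell variable names `json`, `anywhere`, and `ending` are illustrative only.

```shell
# The question's sample input, inlined so the sketch is self-contained.
json='{"A":{"A1":{"foo":{"_":"_"}},"A2":{"_":"_"}},"B":{"B1":{}},"foo":{"_":"_"}}'

# contains(["foo"]) matches any path that has "foo" anywhere in it,
# so paths that continue below foo are reported as well.
anywhere=$(printf '%s' "$json" | jq -c 'paths | select(contains(["foo"]))')

# .[-1] is the last path component, so only paths ending at "foo" remain.
ending=$(printf '%s' "$json" | jq -c 'paths | select(.[-1] == "foo")')

printf 'anywhere:\n%s\n\nending:\n%s\n' "$anywhere" "$ending"
```

The first filter reports four paths and the second only the two that end at foo.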

Bonus points:

(1) If your jq has tostream:

$ jq 'fromstream(tostream| select(.[0]|index("foo")))'

Or better yet, since your input is large, you can use the streaming parser (jq -n --stream) with this filter:

fromstream( inputs|select( (.[0]|index("foo"))))
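
Put together as a full command, the streaming variant might look like the sketch below. The sample input is inlined and piped on stdin for the sake of a self-contained example; with a real file you would pass the file name to jq instead. The variable names `json` and `filtered` are illustrative.

```shell
# The question's sample input, inlined so the sketch is self-contained.
json='{"A":{"A1":{"foo":{"_":"_"}},"A2":{"_":"_"}},"B":{"B1":{}},"foo":{"_":"_"}}'

# --stream turns the input into [path, leaf] and closing [path] events;
# -n stops jq from consuming the first event itself, so that `inputs`
# sees all of them. Only events whose path contains "foo" are kept,
# and fromstream reassembles the surviving events into one object.
filtered=$(printf '%s' "$json" |
  jq -c -n --stream 'fromstream(inputs | select(.[0] | index("foo")))')

printf '%s\n' "$filtered"
```

Because the events are filtered as they are parsed, jq should not need to hold the whole original document in memory, only the reduced result.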

(2) Whether or not your jq has tostream:

. as $in
| reduce (paths(scalars) | select(index("foo"))) as $p
    (null; setpath($p; $in|getpath($p)))

In all three cases, the output is:

{
  "A": {
    "A1": {
      "foo": {
        "_": "_"
      }
    }
  },
  "foo": {
    "_": "_"
  }
}

Upvotes: 46
