Explosion Pills

Reputation: 191729

Getting only desired properties from nested array values with jq

The structure I ultimately want would be:

{
  "catalog": [
    {
      "name": "X",
      "catalog": [
        { "name": "Y", "uniqueId": "Z" },
        { "name": "Q", "uniqueId": "B" }
      ]
    }
  ]
}

This is what the existing structure already looks like, except that there are many other properties at each level (https://gist.github.com/ajcrites/e0e0ca4ca3a08ff2dc401ec872e6094c). I just want to filter those out and get JSON that looks exactly like this.

I started with jq '.catalog', but this returns only the array, and I still want the catalog property name in the output. I can get that with jq '{catalog: .catalog[]}', but this prints each catalog object individually, so the output is a sequence of separate objects rather than a single JSON document. I still want the objects to stay inside the array. Is there a way to keep only specific key-value pairs within arrays using jq?

Upvotes: 0

Views: 2046

Answers (4)

Jeff Mercado

Reputation: 134811

You could build a file containing the paths into the JSON (expressed as arrays) that you want to keep, then filter out any values whose paths are not in that list.

paths.json:

["catalog","name"]
["catalog","catalog","name"]
["catalog","catalog","uniqueId"]

Then filter values based on their paths. Using streams is a great way to go for this since it gives you access to these paths directly:

$ jq --slurpfile paths paths.json '
def keep_path($path): any($paths[]; . == [$path[] | select(strings)]);
fromstream(tostream | select(length == 1 or keep_path(.[0])))
' input.json
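For a quick check without a separate paths.json, here is a minimal sketch of the same filter run against a small made-up input (the prop1 and prop11 keys are hypothetical stand-ins for the unwanted properties). It assumes jq 1.5 or later is on the PATH and passes the path list with --argjson instead of --slurpfile; -S only sorts keys so the output is stable:

```shell
# Hypothetical input with the same nesting as the question, plus two unwanted keys.
input='{"catalog":[{"name":"X","prop1":1,"catalog":[{"name":"Y","uniqueId":"Z","prop11":""}]}]}'

# Paths to keep, with array indexes left out (same content as paths.json above).
paths='[["catalog","name"],["catalog","catalog","name"],["catalog","catalog","uniqueId"]]'

result=$(printf '%s' "$input" | jq -cS --argjson paths "$paths" '
  # Strip numeric indexes from the event path, then look it up in the keep list.
  def keep_path($path): any($paths[]; . == [$path[] | select(strings)]);
  # Closing events have length 1 and are always passed through.
  fromstream(tostream | select(length == 1 or keep_path(.[0])))')

echo "$result"
```

This should print {"catalog":[{"catalog":[{"name":"Y","uniqueId":"Z"}],"name":"X"}]}. Because the numeric indexes are dropped before the comparison, one entry in the list covers every element of an array.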

Upvotes: 0

peak

Reputation: 116670

If the goal is to remove certain properties, then one could do so using walk/1. For example, to remove properties whose names start with "prop":

walk(if type == "object"
     then with_entries(select(.key|startswith("prop") | not))
     else . end)

The same approach would also be applicable if the focus is on retaining certain properties, e.g.:

walk(if type == "object" 
     then with_entries(select(.key == "name" or .key == "uniqueId" or .key == "catalog"))
     else . end)
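As a self-contained sketch, the retaining variant can be tried on a small made-up input (prop0, prop1, prop11 and prop12 are hypothetical names for the properties to drop). It assumes a jq build where walk is built in, which as far as I know means 1.6 or later; -S sorts keys so the result does not depend on the jq version:

```shell
# Hypothetical input: the shape from the question, with extra "prop" keys at each level.
input='{"catalog":[{"name":"X","prop1":1,"catalog":[{"name":"Y","uniqueId":"Z","prop11":""},{"name":"Q","uniqueId":"B","prop12":""}]}],"prop0":true}'

result=$(printf '%s' "$input" | jq -cS '
  # Visit every value; on objects keep only the three wanted keys.
  walk(if type == "object"
       then with_entries(select(.key == "name" or .key == "uniqueId" or .key == "catalog"))
       else . end)')

echo "$result"
```

This should print {"catalog":[{"catalog":[{"name":"Y","uniqueId":"Z"},{"name":"Q","uniqueId":"B"}],"name":"X"}]}. Note that the key test is applied at every depth, so a uniqueId on an outer catalog entry would be kept as well.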

Upvotes: 0

jq170727

Reputation: 14625

You could start by using tostream to convert your sample.json into a stream of [path, value] arrays, as you can see by running:

jq -c tostream sample.json

This will generate:

[["catalog",0,"catalog",0,"name"],"Y"]
[["catalog",0,"catalog",0,"prop11"],""]
[["catalog",0,"catalog",0,"uniqueId"],"Z"]
[["catalog",0,"catalog",0,"uniqueId"]]
[["catalog",0,"catalog",1,"name"],"Y"]
[["catalog",0,"catalog",1,"prop11"],""]
...

reduce and setpath can be used to convert back into the original form with a filter such as:

reduce (tostream|select(length==2)) as [$p,$v] (
  {};
  setpath($p;$v)
)

Adding conditionals makes it easy to omit properties at any level. For example, the following removes leaf attributes whose names start with "prop":

reduce (tostream|select(length==2)) as [$p,$v] (
  {};
  if $p[-1]|startswith("prop")
  then .
  else setpath($p;$v)
  end
)

With your sample.json this produces:

{
  "catalog": [
    {
      "catalog": [
        {
          "name": "Y",
          "uniqueId": "Z"
        },
        {
          "name": "Y",
          "uniqueId": "Z"
        }
      ],
      "name": "X"
    },
    {
      "catalog": [
        {
          "name": "Y",
          "uniqueId": "Z"
        },
        {
          "name": "Y",
          "uniqueId": "Z"
        }
      ],
      "name": "X"
    }
  ]
}
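The condition in the reduce can also be inverted to keep a whitelist of leaf names instead of dropping a prefix. A minimal sketch, run against a small made-up input rather than the real sample.json (prop1 and prop11 are hypothetical), assuming jq 1.5 or later on the PATH:

```shell
# Hypothetical input: two inner entries, with unwanted keys at both levels.
input='{"catalog":[{"name":"X","prop1":1,"catalog":[{"name":"Y","uniqueId":"Z","prop11":""},{"name":"Q","uniqueId":"B"}]}]}'

result=$(printf '%s' "$input" | jq -cS '
  # Only [path, value] events carry leaves; rebuild the document from the wanted ones.
  reduce (tostream | select(length == 2)) as [$p, $v] (
    {};
    if ($p[-1] == "name" or $p[-1] == "uniqueId")
    then setpath($p; $v)
    else .
    end
  )')

echo "$result"
```

This should print {"catalog":[{"catalog":[{"name":"Y","uniqueId":"Z"},{"name":"Q","uniqueId":"B"}],"name":"X"}]}. The intermediate catalog keys do not need to be listed, since setpath recreates them from the path of each retained leaf.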

Upvotes: 0

peak

Reputation: 116670

The following transforms the given input to the desired output and may well be what you want:

{catalog}
| .catalog |= map( {name, catalog} )
| .catalog[].catalog |= map( {name, uniqueId} )
| .catalog |= .[0:1]

However, it's not clear to me that this is really what you want, as you don't discuss the duplication in the given JSON input. So maybe you don't really want the last line in the above, or maybe you want duplicates to be handled in some other way, or ....

Anyway, the trick to keeping things simple here is to use |=.

An alternative approach would be to use del to delete the unwanted properties (rather than selecting the ones you want), but in the present case, that would be (at best) tedious.
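To illustrate why, here is a rough sketch of the del variant on a small made-up input. The names prop0, prop1 and prop11 are hypothetical; with the real data, every unwanted property at every level would have to be listed by hand. It assumes jq is on the PATH:

```shell
# Hypothetical input with one unwanted key per level.
input='{"catalog":[{"name":"X","prop1":1,"catalog":[{"name":"Y","uniqueId":"Z","prop11":""}]}],"prop0":true}'

result=$(printf '%s' "$input" | jq -cS '
  # Each unwanted property needs its own path expression.
  del(.prop0, .catalog[].prop1, .catalog[].catalog[].prop11)')

echo "$result"
```

This should print {"catalog":[{"catalog":[{"name":"Y","uniqueId":"Z"}],"name":"X"}]}. The list inside del grows with the number of unwanted properties, whereas the |= version only names the ones to keep.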

Upvotes: 3
