SnIpY

Reputation: 662

Iterate over json with jq

I am parsing an API which sends me a JSON response like this:

{
  "newList": {
    "243": {
      "id": "243",
      "name": "test",
      "create": {
        "date": "2017-08-31 13:57:29"
      }
    },
    "244": {
      "id": "244",
      "name": "test",
      "create": {
        "date": "2017-08-31 13:57:29"
      }
    }
 }
}

I am trying to get the name and the create date out of this using bash with jq, so far with little to no success.

jq '.newList' does work and brings me down one level, but that's not sufficient.

jq '.newList .243' gives me a compile error. Furthermore, the 243 is dynamic and can change at any time. What am I doing wrong here?

Upvotes: 13

Views: 19429

Answers (2)

Douglas Winship

Reputation: 406

First, the easy part: you would need .newList."243" (with double-quotes) to access the "243" details. jq was interpreting .243 as a number instead of a string, and the object keys are all strings.
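As a quick check of the quoted-key syntax, here is a minimal sketch. It assumes jq 1.5 or later is installed and uses a shortened copy of the JSON from the question; the shell variable names are only for illustration.

```shell
# Shortened copy of the question's JSON (only the "243" entry).
json='{"newList":{"243":{"id":"243","name":"test","create":{"date":"2017-08-31 13:57:29"}}}}'

# Quoted key: the digits are treated as an object key (a string), not a number.
name=$(printf '%s' "$json" | jq -r '.newList."243".name')

# Equivalent bracket form.
date=$(printf '%s' "$json" | jq -r '.newList["243"].create.date')

echo "$name"
echo "$date"
```

Both forms address the same key; the bracket form is handy when the key comes from a variable.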

But as you say, you don't know the keys ahead of time, so we can't use specific keys. That's okay, there are ways to get them if we need them. And in any case, your original question just asks for the name and the create date, so we can ignore the numeric keys.

The simple version

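To discard the keys, we can use map, which applies some other filter to every value of an object (or every item of an array) and collects the results in an array. So our solution is going to look something like cat input.json | jq ' .newList | values | map(some_filter) '. (Strictly speaking, the values step is optional: jq's built-in values filter only drops null inputs, it does not pull the values out of an object. It is map that does the iterating.)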

What is some_filter? Well, it's whatever we want it to be, so long as it takes in the right kind of object and spits out the kind of output that we're looking for. I used to try to write these things as one-liners, but I've found that jq is a lot easier to think about if you use the def keyword to define your own filters (much like you'd define helper functions in other languages).

My proposed solution:

cat input.json | jq '
      def details_to_string:
          "Details: \(.id), \(.name), \(.create.date)";

      .newList | values | map(details_to_string)
'

How does this work?

  • Before we start processing the input, we define a new filter called 'details_to_string'. It takes in an object that has an 'id' field, a 'name' field, and a '.create.date' field, and uses string interpolation to return a string. (I'm not sure what output format you wanted, but a formatted string is an easy example.) Note the colon at the end of the 'def' line, and the semicolon at the end of the actual filter definition.
  • The actual processing begins on the final line. This is where the JSON object from the input file is fed in to the program.
  • First, the object goes through a .newList filter. The result is the newList object ( { "243": X, "244": Y } where X and Y are smaller JSON objects).
  • Next, this object goes through the values filter. In jq, values only drops null inputs, so the object passes through unchanged (this step can be left out).
  • Finally this object passes through map(details_to_string). map runs the details_to_string filter on every value of an object (or every item of an array) and collects the results in an array. So it takes in an object like { "243": X_details_object, "244": Y_details_object } and outputs an array like [X_string, Y_string].
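The stages above can be inspected one at a time. The following is a minimal sketch, assuming jq 1.5 or later and a shortened copy of the question's JSON; the shell variable names are hypothetical. It shows that values leaves the object as an object and that map is the step that produces the array.

```shell
# Shortened copy of the question's JSON.
json='{"newList":{"243":{"id":"243","name":"test"},"244":{"id":"244","name":"test"}}}'

# Stage 1: .newList yields the inner object; its keys are the dynamic ids.
keys=$(printf '%s' "$json" | jq -c '.newList | keys')

# Stage 2: values only drops null inputs, so we still have an object here.
type_after_values=$(printf '%s' "$json" | jq -r '.newList | values | type')

# Stage 3: map runs the filter on each value and collects the results in an array.
names=$(printf '%s' "$json" | jq -c '.newList | values | map(.name)')

echo "$keys"
echo "$type_after_values"
echo "$names"
```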

What's the output?

[
  "Details: 243, test, 2017-08-31 13:57:29",
  "Details: 244, test, 2017-08-31 13:57:29"
]

If you just want the individual strings (no square brackets, no commas) then you can split the array open at the end: .newList | values | map(details_to_string) | .[] . If you also change 'jq' to 'jq -r' then you'll get rid of the quote marks as well.

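If you only wanted the first item in the list, then you can collect the values into an array with [.[]] and extract the first one with .[0]: .newList | [.[]] | .[0] | details_to_string (Note that .[0] cannot index an object directly, which is why the [.[]] step is needed, and that we are now working with a single item, so we don't need map anymore.)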
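Here is a small sketch of both variations, assuming jq 1.5 or later and the JSON from the question stored in a shell variable (the variable names are hypothetical). The first command prints one raw string per line; the second collects the values with [.[]] before taking the first one.

```shell
json='{"newList":{
  "243":{"id":"243","name":"test","create":{"date":"2017-08-31 13:57:29"}},
  "244":{"id":"244","name":"test","create":{"date":"2017-08-31 13:57:29"}}}}'

# One raw string per line: no brackets, commas, or quote marks.
lines=$(printf '%s' "$json" | jq -r '
  def details_to_string: "Details: \(.id), \(.name), \(.create.date)";
  .newList | map(details_to_string) | .[]')

# Only the first entry: collect the values into an array, then index it.
first=$(printf '%s' "$json" | jq -r '
  def details_to_string: "Details: \(.id), \(.name), \(.create.date)";
  .newList | [.[]] | .[0] | details_to_string')

echo "$lines"
echo "$first"
```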

The more interesting version

In a comment above, you asked about how to capture and include the keys "243" and "244", and you mentioned that those strings wouldn't be available in a field called "id", like they are in the original example. For this, I think it's best to use a filter called to_entries.

to_entries makes the keys and the values of an object available, by turning them both into values. Each key-value pair inside an object becomes something that I think of as an 'entry' object, which looks like { "key": some_key, "value": some_value }. Note that "key" and "value" are literally the strings "key" and "value", so an object like { "243": "alpha", "244": "beta" } becomes an array like [ {"key": "243", "value": "alpha"}, {"key": "244", "value": "beta"} ].
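To see the shape to_entries produces, here is a minimal sketch that assumes jq 1.5 or later. It uses -n so that the object literal in the filter is the only input, and -c for compact output.

```shell
# Turn each key-value pair into a {"key": ..., "value": ...} object.
entries=$(jq -c -n '{"243": "alpha", "244": "beta"} | to_entries')

echo "$entries"
```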

This makes the keys available, but it can also get hard to keep track of how the data is structured as it moves through the system. This is where I find that pulling things out into functions really helps.

cat input.json | jq '
        def entry_to_list:
            [ .key, .value.name, .value.create.date ];

        def list_to_string:
          "Details: " + .[0] + ", " +  .[1] + ", " + .[2] ;

       .newList | to_entries | map( entry_to_list | list_to_string )
    '

output:

[
  "Details: 243, test, 2017-08-31 13:57:29",
  "Details: 244, test, 2017-08-31 13:57:29"
]

With this version, you get an intermediate step with values like ["244", "test", "2017-08-31 13:57:29"], which opens up a few options for sorting the results. (Arrays are sorted by the first field first, then by the second field, etc.)

  • .newList | to_entries | map(entry_to_list) | sort | map(list_to_string) (sort the entries before converting to strings)
  • .newList | to_entries | map(entry_to_list) | min | list_to_string (select the lowest entry before converting to string)
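A minimal sketch of the sorting options follows, assuming jq 1.5 or later. The input here is hypothetical: it lists "244" before "243" and uses different names and dates than the question, so that the effect of sort and min is visible.

```shell
# Hypothetical input with the keys deliberately out of order.
json='{"newList":{
  "244":{"name":"second","create":{"date":"2017-09-01 08:00:00"}},
  "243":{"name":"first","create":{"date":"2017-08-31 13:57:29"}}}}'

# sort compares the arrays element by element, so the key decides first.
sorted=$(printf '%s' "$json" | jq -c '
  def entry_to_list: [ .key, .value.name, .value.create.date ];
  .newList | to_entries | map(entry_to_list) | sort')

# min picks the lowest array; .[0] is the key of that entry.
lowest_key=$(printf '%s' "$json" | jq -r '
  def entry_to_list: [ .key, .value.name, .value.create.date ];
  .newList | to_entries | map(entry_to_list) | min | .[0]')

echo "$sorted"
echo "$lowest_key"
```

To sort by date instead of by key, one option is to put .value.create.date first in entry_to_list.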

Upvotes: 1

Inian

Reputation: 85580

Assuming your root node name is newList, as in the question, you can use string interpolation with jq to get the id and create date:

api-producing-json | jq --raw-output '.newList[] | "\(.id) \(.create.date)"'

This way the filter is independent of the dynamic numbers in the nodes (243, 244). Drop the --raw-output flag if you need the output with quotes.

To process the results further, you can iterate over them in a bash loop:

while read -r id name date; do
    echo "Do whatever with ${id} ${name} ${date}"
done< <(api-producing-json | jq --raw-output '.newList[] | "\(.id) \(.name) \(.create.date)"')

or, to delimit the fields more carefully if you are worried about spaces in any of them, use | as the delimiter in the filter, as in "\(.id)|\(.name)|\(.create.date)", and set IFS='|' on the read command in the while loop (quoted, so the shell does not treat it as a pipe) so the variables are stored appropriately.
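The delimiter variant can be sketched as follows, assuming jq 1.5 or later. The sample name "test item" is hypothetical and contains a space on purpose. This sketch pipes the rows into the loop instead of using process substitution, so it also runs in a plain POSIX shell; the loop output is captured in a variable because a piped while loop runs in a subshell.

```shell
# Hypothetical entry whose name contains a space.
json='{"newList":{"243":{"id":"243","name":"test item","create":{"date":"2017-08-31 13:57:29"}}}}'

# Emit one row per entry with | between the fields.
rows=$(printf '%s' "$json" | jq --raw-output '.newList[] | "\(.id)|\(.name)|\(.create.date)"')

# IFS is set only for read, so fields are split on | and spaces are kept.
out=$(printf '%s\n' "$rows" | while IFS='|' read -r id name date; do
    echo "id=${id} name=${name} date=${date}"
done)

echo "$out"
```

This still breaks if a field itself contains the | character, so pick a delimiter that cannot occur in the data.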


Always use the official jq documentation for help, and use this nice playground, jq online, to test out the filters.

Upvotes: 25
