Olya Orlova
Olya Orlova

Reputation: 25

JQ to parse nested arrays

I have an issue to get data from JSON:

{
  "operations": [
    {
      "operationName": "GetValue",
      "batch_size": "2",
      "orders": [
        {
          "clientId": "7836",
          "validation_time": {
            "place": "136",
            "execute": "5379"
          }
        },
        {
          "clientId": "7837",
          "validation_time": {
            "place": "145",
            "execute": "5401"
          }
        }
      ]
    },
    {
      "operationName": "GetValue",
      "batch_size": "3",
      "orders": [
        {
          "clientId": "7838",
          "validation_time": {
            "place": "122",
            "execute": "5201"
          }
        },
        {
          "clientId": "7839",
          "validation_time": {
            "place": "122",
            "execute": "5201"
          }
        },
        {
          "clientId": "7840",
          "validation_time": {
            "place": "122",
            "execute": "5201"
          }
        }
      ]
    }
  ]
}

As a result, I wanna get info per order including clientId as key, place_validation_time and execute_validation_time per order. I tried the following:

.operations[] | select(.operationName=="GetValue") | {key: .orders[].clientId, place_validation_time: .orders[].validation_time.place, execute_validation_time: .orders[].validation_time.execute, batch_size: .batch_size}

But it has an unexpected result for me, it combined all clientId with all variations of validation_time.place and validation_time.execute, so instead of 1 object per order I've got 4 (for batch_size=2) e.g. Expected:

{
  "key": "7836",
  "place_validation_time": "136",
  "execute_validation_time": "5379",
  "batch_size": "2"
}

Actual I have:

{
  "key": "7836",
  "place_validation_time": "136",
  "execute_validation_time": "5379",
  "batch_size": "2"
}
{
  "key": "7836",
  "place_validation_time": "136",
  "execute_validation_time": "5401",
  "batch_size": "2"
}
{
  "key": "7836",
  "place_validation_time": "145",
  "execute_validation_time": "5379",
  "batch_size": "2"
}
{
  "key": "7836",
  "place_validation_time": "145",
  "execute_validation_time": "5401",
  "batch_size": "2"
}

Does anybody have ideas how to avoid it and get as a result 1 object per 1 order?

Upvotes: 1

Views: 1067

Answers (1)

peak
peak

Reputation: 116957

The cartesian-product behavior is the result of including .orders[] at multiple places inside the object-construction expression. Hoist it instead. Assuming you want to select objects with .batch_size equal to "2", you could write:

.operations[]
| select(.operationName=="GetValue")
| .batch_size as $batch_size
| select($batch_size == "2")
| .orders[]
| {key: .clientId,
   place_validation_time: .validation_time.place,
   execute_validation_time: .validation_time.execute,
   batch_size: $batch_size}

Addendum

If you have other items of interest at the .batch_size level, you would probably want to use a filter more like this:

.operations[]
| select(.operationName=="GetValue")
| . as $it
| select($it.batch_size == "2")
| .orders[]
| {key: .clientId,
   place_validation_time: .validation_time.place,
   execute_validation_time: .validation_time.execute,
   batch_size: $it.batch_size}

Upvotes: 2

Related Questions