blahreport

Reputation: 1080

jq to filter inner array elements but return the whole JSON

TL;DR

How can I return the whole JSON after filtering inner array elements of a top-level key?

Detailed explanation

I have a JSON file describing the COCO image database, formatted as follows (irrelevant elements are elided as ...).

{
  "info": {
    "description": "COCO 2017 Dataset",
    ...
  },
  "licenses": [
    {
      "url": "http://creativecommons.org/licenses/by-nc-sa/2.0/",
      ...
    },
    ...
  ],
  "images": [
    {
      "license": 4,
      ...
    },
    ...
  ],
  "annotations": [
    {
      "segmentation": [
        [
          510.66,
          ...
        ]
      ],
      "area": 702.1057499999998,
      "iscrowd": 0,
      "image_id": 289343,
      "bbox": [
        473.07,
        395.93,
        38.65,
        28.67
      ],
      "category_id": 18,
      "id": 1768
    },
    ...
  ],
  "categories": [
    {
      "supercategory": "person",
      ...
    },
    ...
  ]
}

I need to filter annotations where category_id has one of several values, for example 1, 2.

I can successfully filter such category_ids with

jq -C ' .annotations[] | select( .category_id == 1 or .category_id == 2 ) ' instances_val2017.json | less -R

However, this returns only the matching elements of annotations rather than the whole JSON document, as shown below.

{
  "segmentation": [
    [
      162.72,
      ...
    ]
  ],
  "area": 426.9120499999995,
  "iscrowd": 0,
  "image_id": 45596,
  "bbox": [
    161.52,
    507.18,
    46.45,
    19.16
  ],
  "category_id": 2,
  "id": 124742
}
{
  ...
}

I know it's possible to return these elements as an array by wrapping the expression in [], but how can I return the entire original JSON with only the annotations filtered by the specified category ids?

Upvotes: 1

Views: 577

Answers (1)

blahreport

Reputation: 1080

Okay, I spent three hours trying to solve this yesterday, posted this question this morning, and then figured it out!

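Here is the solution, which uses the update-assignment operator |=. It applies the filter on its right-hand side to the value at the path on its left and returns the whole input with that value replaced, so the other top-level keys pass through unchanged.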

jq '.annotations |= map(select(.category_id | contains(1,2)))' instances_val2017.json

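As per the suggestion of @peak, here is the command with == instead of contains. Because (1,2) generates two values, the comparison yields one result per value, and select keeps an annotation when either one matches.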

jq '.annotations |= map(select(.category_id == (1,2)))' instances_val2017.json
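
If the list of ids needs to come from outside the filter, one possible variation is to pass it in with --argjson and test membership with IN. This is only a sketch under some assumptions: sample.json and its contents are a made-up miniature of the COCO file, $ids is a name chosen for illustration, and IN requires jq 1.6 or later.

```shell
# Hypothetical miniature of the COCO file: three annotations plus one
# top-level key that should pass through untouched.
cat > sample.json <<'EOF'
{
  "info": {"description": "sample"},
  "annotations": [
    {"id": 1, "category_id": 1},
    {"id": 2, "category_id": 18},
    {"id": 3, "category_id": 2}
  ]
}
EOF

# $ids is bound to a JSON array by --argjson; IN($ids[]) is true when
# .category_id equals any element of that array (assumes jq 1.6+).
result=$(jq -c --argjson ids '[1, 2]' \
  '.annotations |= map(select(.category_id | IN($ids[])))' sample.json)

echo "$result"
```

On this sample, the annotations with id 1 and 3 should remain and info should be left as it was. The same filter should work against instances_val2017.json by swapping the file name.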

Upvotes: 3
