Reputation: 10270
I have a file with lines like this:
{"items":["blue","green"]}
{"items":["yellow","green"]}
{"items":["blue","pink"]}
How can I use jq to select and show only the JSON values that have "blue" in their "items" array?
So the output would be:
{"items":["blue","green"]}
{"items":["blue","pink"]}
Upvotes: 31
Views: 61793
Reputation: 116640
On Jan 30, 2017, a builtin named IN was added for efficiently testing whether a JSON entity is contained in a stream. It can also be used for efficiently testing membership in an array. In the present case, the relevant usage would be:
select( .items as $items | "blue" | IN($items[]) )
If your jq does not have IN/1, then so long as your jq has first/1, you can use this equivalent definition:
def IN(s): . as $in | first(if (s == $in) then true else empty end) // false;
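As a rough end-to-end sketch, the definition and the filter above can be combined into one command and run against the sample lines from the question. The shell variable names (`input`, `result`) are only illustrative, and the `-c` flag is used so that each object prints on one line. Including the `def` should be harmless on a jq that already has IN, since a user definition shadows the builtin.

```shell
# Sample input from the question (three JSON objects, one per line).
input='{"items":["blue","green"]}
{"items":["yellow","green"]}
{"items":["blue","pink"]}'

# The fallback definition of IN is included so the filter also runs on
# jq builds that lack the builtin.
result=$(printf '%s\n' "$input" | jq -c '
  def IN(s): . as $in | first(if (s == $in) then true else empty end) // false;
  select( .items as $items | "blue" | IN($items[]) )
')
printf '%s\n' "$result"
# {"items":["blue","green"]}
# {"items":["blue","pink"]}
```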
Using any/0 here is relatively inefficient, e.g. compared to using any/2:
select( any( .items[]; . == "blue" ))
(In practice, index/1 is usually fast enough, but its implementation currently (jq 1.5 and versions through at least July 2017) is suboptimal.)
Upvotes: 11
Reputation: 171
I needed to match with a regular expression in the same situation (in another context, of course). I am posting the code because I did not find a solution for my need on these pages; it may be useful to someone.
For example, to match the blue color using a regular expression:
jq 'select(.items[]|test("bl.*"))' yourfile.json
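One thing to watch for with this form: `.items[] | test(...)` produces one boolean per array element, so an object is emitted once for every element that matches. The sketch below uses a made-up input line with two matching elements to show this, then wraps the test in `any/2` (available in jq 1.5 and later) so the object is emitted at most once. The `^bl` pattern is just an illustrative anchored variant, and the variable names are hypothetical.

```shell
# Hypothetical input: two elements match the pattern "bl.*".
line='{"items":["blue","black"]}'

# The filter from the answer: the object is printed once per matching element.
dup=$(printf '%s\n' "$line" | jq -c 'select(.items[] | test("bl.*"))')
printf '%s\n' "$dup"
# {"items":["blue","black"]}
# {"items":["blue","black"]}

# Wrapping the test in any/2 yields a single boolean per object.
once=$(printf '%s\n' "$line" | jq -c 'select(any(.items[]; test("^bl")))')
printf '%s\n' "$once"
# {"items":["blue","black"]}
```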
Upvotes: 5
Reputation: 134801
While what you have certainly works, it would be more correct to use contains. I would avoid using index here since it can lead to confusion: index("blue") is 0 when "blue" is the first element, and one wouldn't consider that a truthy value and might expect it to be excluded from the results.
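To make the point about 0 concrete, here is a small sketch of what index/1 returns on two of the sample lines and how select treats it. In jq only false and null count as falsy, so the 0 result still passes. The filter `select(.items | index("blue"))` is an assumption about what the index-based approach looks like, and the variable names are illustrative.

```shell
# Two of the sample lines from the question.
sample='{"items":["blue","green"]}
{"items":["yellow","green"]}'

# index("blue") gives the position of the first match, or null if absent.
indexes=$(printf '%s\n' "$sample" | jq -c '.items | index("blue")')
printf '%s\n' "$indexes"
# 0
# null

# select only drops false and null, so the object with index 0 is kept.
kept=$(printf '%s\n' "$sample" | jq -c 'select(.items | index("blue"))')
printf '%s\n' "$kept"
# {"items":["blue","green"]}
```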
Consider using this filter instead:
select(.items | contains(["blue"]))
This has the added benefit that it would work if you wanted items with more than one match by simply adding more to the array.
As Will pointed out in the comments, this isn't quite correct. Strings are compared using substring matching (contains is used recursively) here.
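A quick way to see the substring behaviour is to feed the contains-based filter a value that merely starts with "blue". The input line below is invented for illustration; the exact-match filter used for comparison is the any/0 form given later in this answer.

```shell
# Hypothetical input: no element equals "blue", but one contains it.
tricky='{"items":["bluegreen"]}'

# contains matches because "blue" is a substring of "bluegreen".
loose=$(printf '%s\n' "$tricky" | jq -c 'select(.items | contains(["blue"]))')
printf '%s\n' "$loose"
# {"items":["bluegreen"]}

# An equality test on each element does not match, so nothing is printed.
exact=$(printf '%s\n' "$tricky" | jq -c 'select([.items[] == "blue"] | any)')
printf '%s\n' "$exact"
```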
In retrospect, contains didn't work out as I thought it would. Using index works, but personally I wouldn't use it. There's something about figuring out if an item is in a collection by looking for its index that feels wrong to me. Using contains makes more sense to me, but in light of this information, it wouldn't be ideal in this case.
Here's an alternative that should work correctly:
select([.items[] == "blue"] | any)
Or for a more scalable way if you wanted to be able to match more values:
select(.items as $values | ["blue", "yellow"] | map([$values[] == .] | any) | all)
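As a sketch of how the multi-value filter behaves on the question's sample data: none of the three lines contains both "blue" and "yellow", so that exact filter prints nothing. Swapping the list to `["blue", "green"]` (a change made here purely for illustration) selects the first line only. Variable names are hypothetical.

```shell
# Sample input from the question.
input='{"items":["blue","green"]}
{"items":["yellow","green"]}
{"items":["blue","pink"]}'

# Requires both "blue" and "yellow": no sample line has both.
none=$(printf '%s\n' "$input" | jq -c \
  'select(.items as $values | ["blue", "yellow"] | map([$values[] == .] | any) | all)')
printf '%s\n' "$none"

# Requires both "blue" and "green": only the first line qualifies.
both=$(printf '%s\n' "$input" | jq -c \
  'select(.items as $values | ["blue", "green"] | map([$values[] == .] | any) | all)')
printf '%s\n' "$both"
# {"items":["blue","green"]}
```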
Upvotes: 5