K2xL

Reputation: 10270

How to select items in JQ based on value in array

I have a file with lines like this:

{"items":["blue","green"]}
{"items":["yellow","green"]}
{"items":["blue","pink"]}

How can I use jq to select and show only the JSON values that have "blue" in their "items" array?

So the output would be:

{"items":["blue","green"]}
{"items":["blue","pink"]}

Upvotes: 31

Views: 61793

Answers (4)

peak

Reputation: 116640

On Jan 30, 2017, a builtin named IN was added for efficiently testing whether a JSON entity is contained in a stream. It can also be used for efficiently testing membership in an array. In the present case, the relevant usage would be:

select( .items as $items | "blue" | IN($items[]) )

If your jq does not have IN/1, then so long as your jq has first/1, you can use this equivalent definition:

def IN(s): . as $in | first(if (s == $in) then true else empty end) // false;
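
As a quick check against the sample input from the question, the fallback definition can be placed ahead of the filter. This is only a minimal sketch: the shell function name `select_blue` is made up for illustration, and the local `def` shadows any builtin IN/1 that your jq may already provide.

```shell
# Hypothetical helper: keep only objects whose "items" array contains "blue".
select_blue() {
  jq -c '
    # Fallback definition of IN/1, as quoted in this answer.
    def IN(s): . as $in | first(if (s == $in) then true else empty end) // false;
    select( .items as $items | "blue" | IN($items[]) )
  '
}

printf '%s\n' \
  '{"items":["blue","green"]}' \
  '{"items":["yellow","green"]}' \
  '{"items":["blue","pink"]}' | select_blue
# prints:
# {"items":["blue","green"]}
# {"items":["blue","pink"]}
```

Because the comparison is exact equality, an item such as "bluebird" would not be selected.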

any/0

Using any/0 here is relatively inefficient, e.g. compared to using any/2:

select( any( .items[]; . == "blue" ))

(In practice, index/1 is usually fast enough, but its implementation currently (jq 1.5 and versions through at least July 2017) is suboptimal.)

Upvotes: 11

I needed to match with a regular expression in the same situation (in another context, of course). I'm posting the code because I did not find a solution for that need on this page; it may be useful to someone.

For example, to match the blue color using a regular expression:

jq 'select(.items[]|test("bl.*"))' yourfile.json

jqPlay
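
One thing to watch for with the regex filter: `.items[]` produces one result per array element, so an object with two matching items (say "blue" and "black") is emitted twice. A possible variation, sketched here under assumptions, wraps the test in any/2 so each object is emitted at most once. The helper name `select_bl_once` and the anchored pattern `^bl` are illustrative choices, not part of the original answer, and test/1 needs a jq built with regex support (1.5 or later).

```shell
# Hypothetical helper: select objects with at least one item starting with "bl",
# emitting each matching object only once.
select_bl_once() {
  jq -c 'select(any(.items[]; test("^bl")))'
}

printf '%s\n' \
  '{"items":["blue","black"]}' \
  '{"items":["yellow","green"]}' | select_bl_once
# prints: {"items":["blue","black"]}
```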

Upvotes: 5

Jeff Mercado

Reputation: 134801

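While what you have certainly works, it would be more correct to use contains. I would avoid using index for this since it can lead to confusion: index("blue") returns 0 when "blue" is the first element, and since one wouldn't normally consider 0 a truthy value, a reader might expect that object to be excluded from the results. (In jq, only false and null are falsy, which is why the filter still works.)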

Consider using this filter instead:

select(.items | contains(["blue"]))

This has the added benefit that it would work if you wanted items with more than one match by simply adding more to the array.

As Will pointed out in the comments, this isn't quite correct. Strings are compared using substring matching (contains is used recursively) here.
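
The substring behaviour is easy to reproduce. In this small sketch (the helper name `select_contains_blue` and the "bluebird" input are made up for illustration), the object is selected even though none of its items is exactly "blue":

```shell
# Hypothetical helper wrapping the contains-based filter.
select_contains_blue() {
  jq -c 'select(.items | contains(["blue"]))'
}

# "bluebird" is not "blue", yet the object is selected because
# "blue" is a substring of "bluebird".
printf '%s\n' '{"items":["bluebird","green"]}' | select_contains_blue
# prints: {"items":["bluebird","green"]}
```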

In retrospect, contains didn't work out as I thought it would. Using index works, but personally I wouldn't use it. There's something about figuring out if an item is in a collection by looking for its index that feels wrong to me. Using contains makes more sense to me, but in light of this information, it wouldn't be ideal in this case.


Here's an alternative that should work correctly:

select([.items[] == "blue"] | any)

Or for a more scalable way if you wanted to be able to match more values:

select(.items as $values | ["blue", "yellow"] | map([$values[] == .] | any) | all)
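
To see how the multi-value form behaves, here is a sketch run against the sample input from the question, requiring both "blue" and "green" rather than "blue" and "yellow". The helper name `select_blue_and_green` is hypothetical.

```shell
# Hypothetical helper: keep only objects whose "items" contain
# every value in the list ["blue", "green"], using exact equality.
select_blue_and_green() {
  jq -c 'select(.items as $values | ["blue", "green"] | map([$values[] == .] | any) | all)'
}

printf '%s\n' \
  '{"items":["blue","green"]}' \
  '{"items":["yellow","green"]}' \
  '{"items":["blue","pink"]}' | select_blue_and_green
# prints: {"items":["blue","green"]}
```

Only the first object has both values, so it is the only one printed.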

Upvotes: 5

K2xL

Reputation: 10270

I found the answer:

jq 'select(.items | index("blue"))'
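
The filter works because index("blue") yields null when the value is absent, and jq treats only false and null as falsy, so an index of 0 still passes select. A small sketch showing the intermediate values for the sample input (the helper name `blue_index` is made up for illustration):

```shell
# Hypothetical helper: show what index("blue") returns for each input object.
blue_index() {
  jq -c '.items | index("blue")'
}

printf '%s\n' \
  '{"items":["blue","green"]}' \
  '{"items":["yellow","green"]}' \
  '{"items":["blue","pink"]}' | blue_index
# prints:
# 0
# null
# 0
```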

Upvotes: 28
