artfulrobot

Reputation: 21427

Create object from stream of objects keyed by a property in jq

This is a question about the command-line JSON processor jq. It is not about JavaScript or jQuery or anything else with js and qs in its name :-)

I have input data like:

{ "id": "person1", "name": "wilma", "age": "quite old"}
{ "id": "person2", "name": "fred"}
{ "id": "person1", "name": "betty", "x": "extra"}

I want output like this:

{
   "person1": { "name": "betty", "age": "quite old", "x": "extra" },
   "person2": { "name": "fred" }
}

I have tried various things!

e.g.

jq -s '.[] | { (.id) : . }' <data

gives

{ "person1": { "id": "person1", "name": "wilma", "age": "quite old" }}
{ "person2": { "id": "person2", "name": "fred" }}
{ "person1": { "id": "person1", "name": "betty", "x": "extra" }}

Which is sort of there, except it's outputting a stream of objects instead of just one. I need to merge all those objects together.

jqplay.org example

I've also tried using group_by(.id)[]|add which merges each item but still results in a stream. https://jqplay.org/s/lh6QUQ0DO4

Upvotes: 2

Views: 1284

Answers (2)

artfulrobot

Reputation: 21427

Ah! I've got it! Or I've got one solution - please post if there's a better way.

jq -s '[group_by(.id)[]| add | { (.id) : . } ]|add' <data

https://jqplay.org/s/BfAdRBZUMW

  1. group_by groups the inputs by their .id value and produces an array of arrays; each inner array holds the objects that share an id.

  2. [] then emits each inner array in turn, and add merges it: because the elements are objects, add combines them with +, so later values win where keys clash.

  3. That leaves a stream of two merged objects, one per id. Each is fed to an object constructor which plucks the id as the key and uses the whole object as the value. This is still a stream of separate objects.

  4. The outer [ ... ] (opening at the start of the filter) collects that stream into an array and feeds it to add (again), which merges the single-key objects created in (3) into one.
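The stages above can be watched one at a time. This is only a sketch: it assumes the three sample lines from the question are saved in a file called data, and it adds -c purely to keep the output compact.

```shell
# Sample input from the question, written to a scratch file named "data".
cat > data <<'EOF'
{ "id": "person1", "name": "wilma", "age": "quite old"}
{ "id": "person2", "name": "fred"}
{ "id": "person1", "name": "betty", "x": "extra"}
EOF

# Steps 1-2: group by id, then merge each group (a stream of two objects).
jq -s -c 'group_by(.id)[] | add' data

# Step 3: wrap each merged object under its id (still a stream).
jq -s -c 'group_by(.id)[] | add | { (.id): . }' data

# Step 4: collect the stream into an array and merge it into one object.
jq -s -c '[group_by(.id)[] | add | { (.id): . }] | add' data
```

Only the last command prints a single object; the first two each print one line per id.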

It works, but there may be a cleaner way.

EDIT

This is uglier but produces the same result and is ~24% faster on a 9MB dataset.

jq -s 'reduce [.[]|{ (.id) : . }][] as $item ({}; . * $item )' <data

This uses reduce <stream> as $var (<initial value>; <update>), starting with an empty object {}. Inside the update, . is the result accumulated so far, and the recursive merge operator * merges each incoming $item into it. I'm surprised it's faster, but I understand that group_by does a sort, so I guess that's an additional time cost.
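The choice of * over + matters here, because every $item is a single-key object and two of them share the key person1. A small comparison, using two hand-written objects (trimmed versions of the question's data, passed in with --argjson) rather than the real input:

```shell
# Two single-key objects sharing the key "person1", shaped like the
# output of { (.id): . } but with the id field left out for brevity.
a='{"person1":{"name":"wilma","age":"quite old"}}'
b='{"person1":{"name":"betty","x":"extra"}}'

# + is a shallow merge: the right-hand "person1" value replaces the left one.
plus=$(jq -n -c --argjson a "$a" --argjson b "$b" '$a + $b')
echo "$plus"

# * is a recursive merge: the two "person1" objects are combined key by key.
star=$(jq -n -c --argjson a "$a" --argjson b "$b" '$a * $b')
echo "$star"
```

With +, the age from the first object is lost; with *, it survives alongside the newer name and x.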

Upvotes: 1

peak

Reputation: 116957

You could tweak your attempt as follows:

jq -s 'map({ (.id) : . }) | add' <data

However, it would be more efficient to use inputs and reduce with the -n command-line option instead of -s.

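Of course, this approach runs the risk of collisions: add combines objects with +, so when two inputs share an id the later object replaces the earlier one instead of being merged with it.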

You might also want to add del(.id), so that the id is dropped from each value as in the desired output.
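One way these suggestions might fit together is sketched below. The exact filter is an assumption, not something spelled out in the answer: it uses * rather than + so that records sharing an id are merged instead of overwritten, and it assumes jq 1.5 or later (for inputs) and a file named data holding the question's sample lines.

```shell
# Sample input from the question, written to a scratch file named "data".
cat > data <<'EOF'
{ "id": "person1", "name": "wilma", "age": "quite old"}
{ "id": "person2", "name": "fred"}
{ "id": "person1", "name": "betty", "x": "extra"}
EOF

# -n stops jq from reading input itself; `inputs` then streams the objects
# one at a time, so the whole file is never slurped into an array.
# Each object is keyed by its id, with the id removed from the value.
jq -n 'reduce inputs as $item ({}; . * { ($item.id): ($item | del(.id)) })' data
```

For the sample data this produces a person1 entry carrying betty's name together with the age and x fields, and a person2 entry with just the name.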

Upvotes: 2
