Reputation: 541
I have several groups of JSON files, where each group follows a common pattern of data as below:
file 1:
{
  "projects": [
    {
      "id": 15658857,
      "code": "111"
    },
    {
      "id": 15623456,
      "code": "122"
    }
  ],
  "total_entries": 1391,
  "links": {
    "next": "https://api.xxx.com/projects?page=12&per_page=100",
    "last": "https://api.xxx.com/projects?page=14&per_page=100"
  }
}
file 2:
{
  "projects": [
    {
      "id": 15658857,
      "code": "211"
    }
  ],
  "total_entries": 2391,
  "links": {
    "next": "https://api.xxx.com/projects?page=22&per_page=100",
    "last": "https://api.xxx.com/projects?page=24&per_page=100"
  }
}
file 3:
{
  "projects": [
    {
      "id": 15658857,
      "code": "311"
    },
    {
      "id": 15623456,
      "code": "322"
    },
    {
      "id": 13438719,
      "code": "333"
    }
  ],
  "total_entries": 3391,
  "links": {
    "next": "https://api.xxx.com/projects?page=32&per_page=100",
    "last": "https://api.xxx.com/projects?page=34&per_page=100"
  }
}
The above 3 files are samples from one group, and each file in this group has an array element "projects". Other groups have the same structure but a different array element name. I need to merge all the files of a group into a single file per group. The expected output for the above files is:
{
  "projects": [
    {
      "id": 15658857,
      "code": "111"
    },
    {
      "id": 15623456,
      "code": "122"
    },
    {
      "id": 15658857,
      "code": "211"
    },
    {
      "id": 15658857,
      "code": "311"
    },
    {
      "id": 15623456,
      "code": "322"
    },
    {
      "id": 13438719,
      "code": "333"
    }
  ],
  "total_entries": 1391
}
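For context, a file from another group looks the same except for the array key; here is a made-up illustration (not from my actual data) using a "users" array instead of "projects":
{
  "users": [
    {
      "id": 17345678,
      "code": "911"
    }
  ],
  "total_entries": 4391,
  "links": {
    "next": "https://api.xxx.com/users?page=42&per_page=100",
    "last": "https://api.xxx.com/users?page=44&per_page=100"
  }
}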
I used the following jq code to achieve this.
jq -s ".[0].projects=([.[].projects]|flatten)|.[0] | del(.links)" file[123].json
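(For reference, what this does: -s slurps all the files into a single array; [.[].projects]|flatten concatenates every file's projects arrays; the result is assigned to .projects of the first object, which is then output with its links removed.)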
But I am not happy with this, as I have to hard-code the array element name ("projects" in this case). I am looking for a solution where the array element name doesn't need to be specified, so I can use the same expression for every file with similar content. Thanks for the help.
Upvotes: 1
Views: 182
Reputation: 116680
The following is essentially the same as @jq170727's solution, but packages the key abstraction into a function that may be worthy of your standard jq library:
# Gather by key all the values of the objects in a stream
def buckets(stream):
  reduce stream as $x ({};
    reduce ($x|keys_unsorted[]) as $key (.;
      .[$key] += [$x[$key]] ));
With this in place, the solution becomes simply:
buckets(inputs) | map_values(add) | del(.links)
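To see why this works, here is the intermediate object that buckets(inputs) builds from the three sample files (links abridged; derived from the sample data above):
{
  "projects": [
    [{"id": 15658857, "code": "111"}, {"id": 15623456, "code": "122"}],
    [{"id": 15658857, "code": "211"}],
    [{"id": 15658857, "code": "311"}, {"id": 15623456, "code": "322"}, {"id": 13438719, "code": "333"}]
  ],
  "total_entries": [1391, 2391, 3391],
  "links": [...]
}
map_values(add) then collapses each bucket: + concatenates the arrays under "projects", sums the numbers under "total_entries" (giving 7173, as in the other answer), and merges the link objects, which del(.links) finally removes.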
For example, if your standard jq library is in ~/.jq/jq/jq.jq then you could use the following one-liner:
jq -n 'include "jq"; buckets(inputs) | map_values(add) | del(.links)' file{1,2,3}.json
total_entries
OP asked:
what do I need to do if I don't want to add the total_entries from each of the files, I would like to take the value from the first file only
The following modification of the above program will use the first-encountered value for total_entries:
buckets(inputs)
| . as $buckets
| map_values(add)
| del(.links) + {total_entries: $buckets["total_entries"][0]}
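(Run against the three sample files, this yields the requested output, with total_entries taken from the first file: 1391.)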
Upvotes: 2
Reputation: 14635
Here is a possible solution assuming your sample data is in file1.json, file2.json and file3.json:
$ jq -Mn '
  # fold each input file into a single object; += concatenates
  # arrays, sums numbers, and merges objects
  reduce inputs as $i ({};
    reduce ($i|keys[]) as $k (.; .[$k] += $i[$k]))
  | del(.links)
' file1.json file2.json file3.json
{
  "projects": [
    {
      "id": 15658857,
      "code": "111"
    },
    {
      "id": 15623456,
      "code": "122"
    },
    {
      "id": 15658857,
      "code": "211"
    },
    {
      "id": 15658857,
      "code": "311"
    },
    {
      "id": 15623456,
      "code": "322"
    },
    {
      "id": 13438719,
      "code": "333"
    }
  ],
  "total_entries": 7173
}
Note that this adds the values for total_entries from each file, giving a different total than the one in the requested output.
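If you want the first file's total_entries instead, a minimal variation of the above (a sketch, not part of the original answer) is to slurp the inputs first so the first file stays addressable:
$ jq -Mn '
  [inputs] as $files
  # same fold as above, but over the slurped array
  | reduce $files[] as $i ({};
      reduce ($i|keys[]) as $k (.; .[$k] += $i[$k]))
  | del(.links)
  # overwrite the summed value with the first file'\''s value
  | .total_entries = $files[0].total_entries
' file1.json file2.json file3.json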
Upvotes: 2