I'm currently trying to get a list of incidents from PagerDuty via the REST API, which returns them as a JSON array. I want to remove any duplicate events by using unique_by() on the incident_key. However, I want the first occurrence of each incident_key, and unique_by() is removing all but the last. Right now, if I have incident_number 849, 850, and 851, all with the same incident_key, unique_by() will return 851.
Simple example:
[
{ "reference_key":"200", "id":"1" },
{ "reference_key":"200", "id":"2" },
{ "reference_key":"200", "id":"3" },
{ "reference_key":"201", "id":"4" },
{ "reference_key":"201", "id":"5" },
{ "reference_key":"201", "id":"6" }
]
What I'm trying to do is use unique_by() to keep the first occurrence of each reference_key, based on the id. So in this case, I'd want the output to be:
[
{ "reference_key":"200", "id":"1" },
{ "reference_key":"201", "id":"4" }
]
The problem is that I have no control over which occurrence is kept, and with the data I'm currently working with, unique_by() is returning the last occurrence instead of the first, like so:
[
{ "reference_key":"200", "id":"3" },
{ "reference_key":"201", "id":"6" }
]
I have tried using reverse and then calling unique_by(), but I am getting the same results. Is there any way to have some control over this?
Upvotes: 0
Views: 4075
Maybe your version of jq is not sufficiently recent. Using jq 1.5:
unique_by( .reference_key )
yields
[{"reference_key":"200","id":"1"},{"reference_key":"201","id":"4"}]
(As of January 18, 2016 (7835a72), the builtin sort
filter is stable; prior to that, stability was platform-dependent.)
If you don't have access to a sufficiently recent version of jq, then consider the following, which has been tested with jq 1.3, 1.4 and 1.5:
def bucketize(f):
  reduce .[] as $x ({}; .[$x|f] += [$x]);

bucketize(.reference_key) | .[][0]
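For comparison, here is a minimal Python sketch of the same bucketize-then-take-first idea (the function name is mine, not from the post), assuming the JSON array has already been parsed:

```python
# Group items by key -- dicts preserve insertion order in Python 3.7+ --
# then keep the first item from each bucket, matching the shape of
# bucketize(.reference_key) | .[][0] above.
def first_per_key(items, key):
    buckets = {}
    for item in items:
        buckets.setdefault(item[key], []).append(item)
    return [bucket[0] for bucket in buckets.values()]

data = [
    {"reference_key": "200", "id": "1"},
    {"reference_key": "200", "id": "2"},
    {"reference_key": "200", "id": "3"},
    {"reference_key": "201", "id": "4"},
    {"reference_key": "201", "id": "5"},
    {"reference_key": "201", "id": "6"},
]
print(first_per_key(data, "reference_key"))
# [{'reference_key': '200', 'id': '1'}, {'reference_key': '201', 'id': '4'}]
```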
Or much more economically:
reduce .[] as $x ({};
$x.reference_key as $key
| if .[$key] then . else .[$key] = $x end)
| .[]
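This first-wins pattern (insert only when the key is unseen) carries over directly to other languages; a minimal Python sketch of the same logic (names are mine, not from the post), again assuming parsed input:

```python
# First-wins insert, mirroring the reduce/if .[$key] pattern above:
# setdefault stores an item only when its key is not already present,
# so the earliest occurrence survives.
def first_occurrence(items, key):
    seen = {}
    for item in items:
        seen.setdefault(item[key], item)
    return list(seen.values())

data = [
    {"reference_key": "200", "id": "1"},
    {"reference_key": "200", "id": "2"},
    {"reference_key": "201", "id": "4"},
    {"reference_key": "201", "id": "5"},
]
print(first_occurrence(data, "reference_key"))
# [{'reference_key': '200', 'id': '1'}, {'reference_key': '201', 'id': '4'}]
```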
Upvotes: 4