Reputation: 23
I am attempting to parse a JSON structure to extract a dependency path, for use in an automation script.
The structure of this JSON is extracted to a format like this:
[
{
"Id": "abc",
"Dependencies": [
]
},
{
"Id": "def",
"Dependencies": [
"abc"
]
},
{
"Id": "ghi",
"Dependencies": [
"def"
]
}
]
Note: Lots of other irrelevant fields removed.
The plan is to be able to pass into my JQ command the Id of one of these and get back out a list.
Eg:
Input: abc
Expected Output: []
Input: def
Expected Output: ["abc"]
Input: ghi
Expected Output: ["abc", "def"]
Currently have a jq script like this (https://jqplay.org/s/NAhuXNYXXO):
jq
'. as $original | .[] |
select(.Id == "INPUTVARIABLE") |
[.Dependencies[]] as $level1Dep | [$original[] | select( [ .Id == $level1Dep[] ] | any )] as $level1Full | $level1Full[] |
[.Dependencies[]] as $level2Dep | [$original[] | select ( [ .Id == $level2Dep[] ] | any )] as $level2Full |
[$level1Dep[], $level2Dep[]]'
Input: abc
Output: empty
Input: def
Output: ["abc"]
Input: ghi
Output: ["def","abc"]
Great! However, as you can see this is not particularly scale-able and will only handle two dependency levels (https://jqplay.org/s/Zs0xIvJ2Zn), and also falls apart horribly when there are multiple dependencies on an item (https://jqplay.org/s/eB9zHQSH2r).
Is there a way of constructing this within JQ or do I need to move out to a different language?
Upvotes: 2
Views: 742
Reputation: 116750
Here's a short program that handles circular dependencies efficiently and illustrates how a subfunction can be defined after the creation of a local variable (here, $next) for efficiency:
def dependents($x):
(map( {(.Id): .Dependencies}) | add) as $next
# Input: array of dependents computed so far
# Output: array of all dependents
| def tc($x):
($next[$x] - .) as $new
| if $new == [] then .
else (. + $new | unique)
# avoid calling unique again:
| . + ([tc($new[])[]] - .)
end ;
[] | tc($x);
dependents($start)
With the given input and an invocation such as
jq --arg start START -f program.jq input.json
the output for various values of START is:
START output
abc []
def ["abc"]
ghi ["def", "abc"]
If the output must be sorted, then simply add a call to sort
.
Upvotes: 1
Reputation: 50760
I know that the data cannot have circular dependencies, it is pulled from a database that enforces this.
It's trivial then. Reduce your input JSON down to an object where each Id and corresponding Dependencies array are paired, and walk through it aggregating dependencies using a recursive function.
def deps($depdb; $id):
def _deps($id): $depdb[$id] // empty
| . + map(_deps(.)[]);
_deps($id);
deps(map({(.Id): .Dependencies}) | add; $fid)
Invocation:
jq -c --arg fid 'ghi' -f prog.jq file
Online demo - arbitrary dependency levels
Online demo - multiple dependencies per Id
Upvotes: 2