Reputation: 73
I am brand new to Gremlin and graph databases in general. I have been tasked with producing a plan for executing a task in an order based on the depth at which each node exists in the subgraph.
So far, I can produce the subgraph and a list of paths based on that subgraph. I am stuck when I try to add depths to the paths, especially when some nodes exist in multiple paths. You can see here the graph I am testing against:
public void generateTestGraph() {
    Vertex web = this.sqlgGraph.addVertex(T.label, "Server", "name", "WEB");
    Vertex app = this.sqlgGraph.addVertex(T.label, "Server", "name", "APP1");
    Vertex app2 = this.sqlgGraph.addVertex(T.label, "Server", "name", "APP2");
    Vertex app3 = this.sqlgGraph.addVertex(T.label, "Server", "name", "APP3");
    Vertex sql = this.sqlgGraph.addVertex(T.label, "Server", "name", "SQL");
    web.addEdge("DEPENDS_ON", app);
    web.addEdge("DEPENDS_ON", app2);
    app.addEdge("DEPENDS_ON", sql);
    app3.addEdge("DEPENDS_ON", sql);
    app2.addEdge("DEPENDS_ON", app3);
    this.sqlgGraph.tx().commit();
}
I can produce path lists like so:
public void testDepGraph() {
    String name = "SQL";
    Graph subGraph = (Graph) this.sqlgGraph.traversal().V().has("name", name)
            .repeat(__.inE("DEPENDS_ON").subgraph("subGraph").outV())
            .in("DEPENDS_ON").loops().is(P.gt(50))
            .cap("subGraph").next();
    Object vl = subGraph.traversal().V().has("name", name)
            .repeat(__.in("DEPENDS_ON")).emit(__.not(__.inE()))
            .path().by("name").toList();
}
But, they look like this:
{
["SQL", "APP1", "WEB"],
["SQL", "APP3", "APP2", "WEB"]
}
Which is close, but I would like to see the output look like this:
{
0: {"SQL"},
1: {"APP1", "APP3"},
2: {"APP2"},
3: {"WEB"}
}
So, my question is: can I get the results I want by using Gremlin directly, or will I have to do some post-processing by hand (in code)?
Upvotes: 0
Views: 567
Reputation: 10904
It can be done using a single query, but it's pretty complex and might be hard to grasp if you've just started using Gremlin.
gremlin> g.V().group("x").
           by("name").
           by(out("DEPENDS_ON").values("name").fold()).
         barrier().
         repeat(
           cap("x").unfold().
           filter(select(values).not(unfold().where(without("y")))).
           select(keys).where(without("y")).
           aggregate("y").
           aggregate("z").
             by(project("a","b").
                  by().
                  by(coalesce(select("z").unfold().select("b").order().by(decr).limit(1).
                                sack(assign).sack(sum).by(constant(1)).sack(),
                              constant(0))))
         ).cap("z").unfold().group().by(select("b")).by(select("a").fold())
==>[0:[SQL],1:[APP3,APP1],2:[APP2],3:[WEB]]
This query creates a dependency map x and then loops through the map, picking only those not-yet-processed entries whose dependencies were all processed in previous iterations. repeat() will terminate once there are no unprocessed components left. After each iteration, the processed components are stored in a list named y, and the same components are stored as tuples (together with the iteration index) in a list named z.
The list named z will eventually be used to create the desired map result, where the iteration index is the map's key and the components are the respective values.
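If the query is hard to follow, the same iteration can be sketched in plain Java as a post-processing step. This is only an illustration of the idea, not part of the query above: the class name DependencyLevels and the method levels are hypothetical, and it assumes the dependency map (the equivalent of x) has already been loaded into memory, with every component present as a key. It needs Java 9+ for List.of.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical helper that mirrors the x / y / z bookkeeping of the query.
public class DependencyLevels {

    // deps maps each component to the components it depends on (the map "x").
    static Map<Integer, List<String>> levels(Map<String, List<String>> deps) {
        Set<String> processed = new LinkedHashSet<>();        // plays the role of "y"
        Map<Integer, List<String>> result = new TreeMap<>();  // plays the role of "z", grouped by level
        int level = 0;
        while (processed.size() < deps.size()) {
            // Collect every unprocessed component whose dependencies are all processed.
            List<String> ready = new ArrayList<>();
            for (Map.Entry<String, List<String>> e : deps.entrySet()) {
                if (!processed.contains(e.getKey()) && processed.containsAll(e.getValue())) {
                    ready.add(e.getKey());
                }
            }
            if (ready.isEmpty()) {
                // Nothing can make progress: a cycle, or a dependency that is not a key in deps.
                throw new IllegalStateException("Cycle or unknown dependency in: " + deps.keySet());
            }
            processed.addAll(ready);
            result.put(level++, ready);
        }
        return result;
    }

    public static void main(String[] args) {
        // Same dependencies as the test graph in the question.
        Map<String, List<String>> deps = new LinkedHashMap<>();
        deps.put("WEB", List.of("APP1", "APP2"));
        deps.put("APP1", List.of("SQL"));
        deps.put("APP2", List.of("APP3"));
        deps.put("APP3", List.of("SQL"));
        deps.put("SQL", List.of());
        System.out.println(levels(deps)); // {0=[SQL], 1=[APP1, APP3], 2=[APP2], 3=[WEB]}
    }
}
```

Unlike the Gremlin version, this sketch fails fast on a cyclic dependency instead of silently stopping, which may or may not be what you want.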
Upvotes: 4