Reputation: 33
I have a graph database that I query with Gremlin, shaped like this image:
I need help building a query that returns every pair of "Persons", with the edge between them labeled with the count of "Events" they have in common. The result should be something like this:
{
  nodes: [
    {id: "PersonA", label: "Person A"},
    {id: "PersonB", label: "Person B"},
    {id: "PersonC", label: "Person C"},
    {id: "PersonD", label: "Person D"},
    {id: "PersonE", label: "Person E"},
    {id: "PersonF", label: "Person F"}
  ],
  edges: [
    {from: "PersonA", to: "PersonB", label: 1},
    {from: "PersonA", to: "PersonC", label: 2},
    {from: "PersonA", to: "PersonD", label: 2},
    {from: "PersonA", to: "PersonE", label: 1},
    {from: "PersonA", to: "PersonF", label: 1},
    {from: "PersonB", to: "PersonC", label: 1},
    {from: "PersonB", to: "PersonD", label: 1},
    {from: "PersonC", to: "PersonD", label: 2},
    {from: "PersonC", to: "PersonE", label: 1},
    {from: "PersonC", to: "PersonF", label: 1},
    {from: "PersonD", to: "PersonE", label: 1},
    {from: "PersonD", to: "PersonF", label: 1},
    {from: "PersonE", to: "PersonF", label: 1}
  ]
}
I've been struggling with this for a few hours and can't get anything close to what I'm looking for.
Upvotes: 2
Views: 151
Reputation: 46206
A picture is nice, but when asking questions about Gremlin it's best to provide a Gremlin script to create your data:
g.addV('person').property(id,'a').as('a').
  addV('person').property(id,'b').as('b').
  addV('person').property(id,'c').as('c').
  addV('person').property(id,'d').as('d').
  addV('person').property(id,'e').as('e').
  addV('person').property(id,'f').as('f').
  addV('event').property(id,'1').as('1').
  addV('event').property(id,'2').as('2').
  addE('attends').from('a').to('1').
  addE('attends').from('a').to('2').
  addE('attends').from('b').to('2').
  addE('attends').from('c').to('1').
  addE('attends').from('c').to('2').
  addE('attends').from('d').to('1').
  addE('attends').from('d').to('2').
  addE('attends').from('e').to('1').
  addE('attends').from('f').to('1').iterate()
I went with this approach to solve your problem:
g.V().hasLabel('person').as('s').
  out().in().
  where(neq('s')).
  path().by(id).
  groupCount().
    by(union(limit(local,1),tail(local,1)).fold()).
  unfold().
  dedup().
    by(select(keys).order(local)).
  order().
    by(select(keys).limit(local,1)).
    by(select(keys).tail(local,1))
which produces the output you're seeking:
gremlin> g.V().hasLabel('person').as('s').
......1> out().in().
......2> where(neq('s')).
......3> path().by(id).
......4> groupCount().
......5> by(union(limit(local,1),tail(local,1)).fold()).
......6> unfold().
......7> dedup().
......8> by(select(keys).order(local)).
......9> order().by(select(keys).limit(local,1))
==>[a, b]=1
==>[a, e]=1
==>[a, c]=2
==>[a, d]=2
==>[a, f]=1
==>[b, c]=1
==>[b, d]=1
==>[c, d]=2
==>[c, e]=1
==>[c, f]=1
==>[d, e]=1
==>[d, f]=1
==>[e, f]=1
The approach above uses path() to gather the "person->event<-person" walk that Gremlin travels over, and where(neq('s')) filters out walks that lead back to the starting person. It then does a groupCount() by the first and last elements of each path, the "person" vertices, which represent the person pairs. We now have a Map with the person pairs and their counts as you want, but it needs a bit of post-processing, so we unfold() the Map into key-value pairs. The first step is to dedup() by the person pairs: the Map currently contains both "a->b" and "b->a" and we don't need both, so deduping by the ordered list of each pair gives us the unique pairs. Finally, we add some order() to make the results look exactly like yours.
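To make those steps concrete outside of a running graph, here is a minimal sketch in plain Python (not Gremlin) that mirrors each stage on the same sample data. The names attends, attendees, paths, directed, pair_counts and edges are made up for illustration, and the dictionaries only stand in for the vertices and 'attends' edges created by the script above.

```python
from collections import Counter

# Sample data copied from the setup script: person id -> attended event ids.
attends = {
    "a": ["1", "2"],
    "b": ["2"],
    "c": ["1", "2"],
    "d": ["1", "2"],
    "e": ["1"],
    "f": ["1"],
}

# Invert to event -> attendees so a person -> event <- person walk can be followed.
attendees = {}
for person, events in attends.items():
    for event in events:
        attendees.setdefault(event, []).append(person)

# out().in().where(neq('s')).path(): every walk that ends on a different person.
paths = [
    (start, event, end)
    for start, events in attends.items()
    for event in events
    for end in attendees[event]
    if end != start
]

# groupCount() keyed by the first and last element of each path.
directed = Counter((start, end) for start, _, end in paths)

# dedup() by the sorted pair: (a, b) and (b, a) carry the same count, keep one.
pair_counts = {}
for (start, end), count in directed.items():
    pair_counts.setdefault(tuple(sorted((start, end))), count)

# order() by both ends of the pair, then reshape into the edge list from the question.
edges = [
    {"from": p, "to": q, "label": pair_counts[(p, q)]}
    for p, q in sorted(pair_counts)
]
for edge in edges:
    print(edge)
```

Each directed pair shows up twice before the dedup stage, once per direction, which is why the unordered pair can safely take the count from whichever direction is seen first.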
I suppose you could try to dedup() immediately after the path() and avoid some extra work in groupCount().
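As a rough check of that idea, the sketch below (again plain Python rather than Gremlin, with hypothetical names) removes the mirrored walks right after the path stage and only then counts. It assumes the deduplication key keeps the event in the middle, so that two people sharing two events still produce two distinct walks.

```python
from collections import Counter

# Same sample data as the setup script: person id -> attended event ids.
attends = {
    "a": ["1", "2"],
    "b": ["2"],
    "c": ["1", "2"],
    "d": ["1", "2"],
    "e": ["1"],
    "f": ["1"],
}

# event -> attendees, used to walk person -> event <- person.
attendees = {}
for person, events in attends.items():
    for event in events:
        attendees.setdefault(event, []).append(person)

# All walks that end on a different person, in both directions.
paths = [
    (start, event, end)
    for start, events in attends.items()
    for event in events
    for end in attendees[event]
    if end != start
]

# Dedup straight after the path stage: a-1-c and c-1-a collapse into one walk,
# but a-1-c and a-2-c stay separate because the event is part of the key.
unique_paths = {(min(s, e), event, max(s, e)) for s, event, e in paths}

# Counting now sees half as many walks and yields unordered pairs directly.
pair_counts = Counter((p, q) for p, _, q in unique_paths)

for pair in sorted(pair_counts):
    print(pair, pair_counts[pair])
```

On this data the counting stage handles 16 walks instead of 32 and no later dedup of the pairs is needed. Whether the equivalent Gremlin is actually faster would depend on the graph and the provider.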
Upvotes: 2