Reputation: 581
I have a Titan graph database with a set of vertices connected by edges that have a property named "property1".
Is it possible to write a Gremlin (or anything else Titan would support) query to:
Find all edges whose value for "property1" is seen 5 or fewer times.
In SQL I would use "Group By", in MongoDB I would use one of the aggregate functions.
I am thinking this may be a job for Furnace/Faunus?
Upvotes: 1
Views: 1385
Reputation: 46206
You can do this by iterating all edges and using groupBy. Here's an example with the toy graph using weight in place of property1:
gremlin> g = TinkerGraphFactory.createTinkerGraph()
==>tinkergraph[vertices:6 edges:6]
gremlin> g.E.groupBy{it.weight}{it}.cap.next()
==>0.5=[e[7][1-knows->2]]
==>1.0=[e[8][1-knows->4], e[10][4-created->5]]
==>0.4=[e[11][4-created->3], e[9][1-created->3]]
==>0.2=[e[12][6-created->3]]
So that groups all edges by their weight. From there you can drop down to standard Groovy functions like findAll to keep only the groups you want (here I keep the weights that have more than one edge; in your case the condition would be v.size()<=5).
gremlin> g.E.groupBy{it.weight}{it}.cap.next().findAll{k,v->v.size()>1}
==>1.0=[e[8][1-knows->4], e[10][4-created->5]]
==>0.4=[e[11][4-created->3], e[9][1-created->3]]
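To make the direction of the filter concrete for the "N or fewer" case, here is a minimal sketch in plain Groovy with no graph involved. The list of maps, the names edges, threshold and rareEdgeIds, and the sample values are all made up for illustration; only groupBy and findAll are the standard Groovy collection methods used above.

```groovy
// Hypothetical stand-in data: each map plays the role of one edge.
def edges = [
    [id: 7,  property1: 'a'],
    [id: 8,  property1: 'b'],
    [id: 9,  property1: 'b'],
    [id: 10, property1: 'c'],
    [id: 11, property1: 'c'],
    [id: 12, property1: 'c'],
]

// Assumed cutoff for this small sample; the question's case would use 5.
def threshold = 2

// Group the edges by property value, as groupBy does in the traversal.
def groups = edges.groupBy { it.property1 }

// Keep only the values that occur `threshold` or fewer times.
def rare = groups.findAll { value, members -> members.size() <= threshold }

// Collect the ids of the edges that belong to those rare groups.
def rareEdgeIds = []
rare.each { value, members -> rareEdgeIds.addAll(members*.id) }

println rare.keySet()
println rareEdgeIds
```

Against the graph itself, the same comparison would presumably go inside the findAll closure applied to the groupBy result, with 5 as the cutoff.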
Obviously this is a fairly expensive operation on a really large graph, as you have to iterate over a lot of edges and build up a Map in memory, which could be big depending on the diversity of the values in property1. If you can find ways to limit edge iteration with other filters, that might be helpful.
This would be a good job for Faunus if you had a really large graph. I'll go with the easy answer here and assume that you don't necessarily need the specific edges whose property1 value occurs 5 or fewer times, and that you just want to know how many times each property1 value occurs. With Faunus you could get a distribution like that with:
g.E.property1.groupCount()
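Once such a distribution is available as a map of value to count, picking out the rare values is a small post-processing step. The following is only a sketch in plain Groovy, not Faunus code: the list values, the cutoff maxCount and the sample data are hypothetical, and countBy stands in for whatever form the groupCount output actually takes.

```groovy
// Hypothetical property1 values, one entry per edge.
def values = ['a', 'b', 'b', 'c', 'c', 'c']

// Assumed cutoff for this small sample; the question's case would use 5.
def maxCount = 2

// Build a value -> occurrence count map, analogous to a groupCount result.
def distribution = values.countBy { it }

// Keep the values that occur `maxCount` or fewer times.
def rareValues = distribution.findAll { value, count -> count <= maxCount }.keySet()

println distribution
println rareValues
```

The resulting set of rare values could then be used as a filter when looking up the matching edges, which avoids holding every edge in memory at once.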
Upvotes: 1