mr s
mr s

Reputation: 73

Summing data up a tree data structure (in a graph database)

I have a tree data structure in which data is entered at the leaf nodes. What I want to do is sum this data up the tree e.g. for any node in the tree, sum up all the data underneath it.

Are there any clever ways to do this using a graph database?

Upvotes: 2

Views: 755

Answers (2)

stephen mallette
stephen mallette

Reputation: 46206

Here's a solution using Gremlin and a tree data sample that just popped up on the aureliusgraphs mailing list:

g = new TinkerGraph()
root = g.addVertex(['name':'root'])
c1   = g.addVertex(['name':'1',    'sortIndex':1])
c11  = g.addVertex(['name':'1.1',  'sortIndex':1])
c12  = g.addVertex(['name':'1.2',  'sortIndex':2])
c121 = g.addVertex(['name':'1.2.1','sortIndex':1])
c122 = g.addVertex(['name':'1.2.2','sortIndex':2])
c13  = g.addVertex(['name':'1.3',  'sortIndex':3])
c2   = g.addVertex(['name':'2',    'sortIndex':2])
c21  = g.addVertex(['name':'2.1',  'sortIndex':1])
c22  = g.addVertex(['name':'2.2',  'sortIndex':2])
c3   = g.addVertex(['name':'3',    'sortIndex':3])
c31  = g.addVertex(['name':'3.1',  'sortIndex':1])
c32  = g.addVertex(['name':'3.2',  'sortIndex':2])
g.addEdge(root, c1, 'has')
g.addEdge(root, c2, 'has')
g.addEdge(root, c3, 'has')
g.addEdge(c1, c11, 'has')
g.addEdge(c1, c12, 'has')
g.addEdge(c1, c13, 'has')
g.addEdge(c2, c21, 'has')
g.addEdge(c2, c22, 'has')
g.addEdge(c3, c31, 'has')
g.addEdge(c3, c32, 'has')
g.addEdge(c12, c121, 'has')
g.addEdge(c12, c122, 'has')

The above code to initialize the graph should paste nicely into a Gremlin prompt. Once the graph is established, simply issue this command to traverse the tree and sum the numeric value (in this case the sortIndex field):

gremlin> total=0;root.out.sideEffect{total+=it.sortIndex}.loop(2){true}
gremlin> total
==>21

The above code initializes a total variable and then begins the traversal from the root vertex, traversing out and then adding the value of the sortIndex field to the total. It then loops/repeats that operation over the tree until it exhausts it (the final true controls how long it loops for).

I used a TinkerGraph for convenience, but this code will work with Neo4j or OrientDB (I saw those as other tags on this question), by simply changing the graph implementation as follows:

g = new Neo4jGraph('/tmp/neo4j')

or

g = new OrientGraph('memory:/graph')

Upvotes: 1

Eve Freeman
Eve Freeman

Reputation: 33145

In cypher (neo4j), you can do something like this:

start n=node:node_auto_index(id={id})
match n-[:parent_of*]->child
where not(child-[:parent_of]->()) // where the child doesn't have children (leaf)
return n, sum(child.val);

http://wes.skeweredrook.com/cypher-summing-data-up-a-tree-leaf-nodes/

Upvotes: 1

Related Questions