Stephanie

Reputation: 146

Neo4j: Iterating from leaf to parent AND finding common children

I've migrated my relational database to Neo4j and am checking whether I can implement some features before I commit to the new system. I just read two Neo4j books, but unfortunately they don't cover two key features that I had hoped would be more self-evident. I'd be most grateful for some quick advice on whether these will be easy to implement or whether I should stick to SQL. Thanks!

Features I need are: 1) I have run a script to assign the :Leaf label to all nodes that are leaves in my tree. For the paths between a known node and its related leaf nodes, I want to assign every node a level property that reflects how many hops that node is from the known node (or from the leaf node, whichever I can get to work most easily).

I tried:

match path=(n:Leaf)-[:R*]->(:Parent {Parent_ID: $known_value})
with n, length(nodes(path)) as hops
set n.Level2=hops;

and:

match path=(n:Leaf)-[:R*]->(:Parent {Parent_ID: $known_value})
with n, path, length(nodes(path)) as hops
foreach (n IN relationships(path) | set n.Level=hops);

The first sets the property only on the leaf nodes, with the full length of the path as its value. The second sets the property on every relationship in the path, again with the full length of the path as its value.

Should I be using shortestPath instead, or create a dummy property with value 1 on every node and iteratively add up that property as a weight?

2) I need to find the common children for a given parent node. For example, my children each [:like] lots of movies, and I would like to create [:like] relationships from myself to just the movies that my children all like in common (so if 1 of 1 likes a movie, then I like it too, but if only 2 of 3 like a movie, nothing happens).

I found a solution with three paths in "Need only common nodes across multiple paths - Neo4j Cypher", but I need a solution that works for any number of paths (starting from 1).

3) Then I plan to start at my furthest leaf nodes, create relationships to the children's movies, then move level by level toward my known node and repeat the relationship creation at each level. That way the top-most grandparent likes only the movies that all children [of all children of all children...] like in common, and if there's one that everybody agrees on, that's the movie the entire extended family will watch Saturday night.

Can this be done with Neo4j, and how hard a task is it for someone with rudimentary Cypher? This is mostly how I did it in my relational database. Should I be looking at implementing this totally differently in a graph database?

Most grateful for any advice. Thanks!

Upvotes: 1

Views: 874

Answers (1)

InverseFalcon

Reputation: 30417

1.

shortestPath() may help when your already matched start and end nodes are not the root and the leaf, in that it won't continue to look for additional paths once the first is found. If your already matched start and end nodes are the root and the leaf, and the graph is a tree (acyclic), there's no real reason to use shortestPath().

Typically when setting something like the depth of a node in a tree, you would use length(path), so the root would be at depth 0, its children at depth 1.

Usually depth is calculated with respect to the root node and not leaf nodes (as an intermediate node may be the ancestor of multiple leaf nodes at differing distances). Taking the depth as the distance from the root makes the depths consistent.
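To illustrate the point outside of Neo4j, here is a minimal Python sketch on a made-up tree (the node names and the `children` dict are hypothetical, not taken from your data). It compares depth measured from the root with hop counts measured from the leaves.

```python
from collections import deque

# Hypothetical tree: parent -> list of children (not from the question's data).
children = {
    "root": ["a", "b"],
    "a": ["leaf1", "c"],
    "c": ["leaf2"],
    "b": [],
    "leaf1": [],
    "leaf2": [],
}

def depths_from_root(tree, root):
    """Breadth-first walk; each node gets exactly one depth (root is 0)."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in tree[node]:
            depth[child] = depth[node] + 1
            queue.append(child)
    return depth

def hops_to_leaves(tree, node):
    """All distances from `node` down to the leaves below it."""
    if not tree[node]:
        return {0}
    return {h + 1 for child in tree[node] for h in hops_to_leaves(tree, child)}

depth = depths_from_root(children, "root")
print(depth["a"])                             # 1
print(sorted(hops_to_leaves(children, "a")))  # [1, 2]
```

Node "a" has a single depth from the root, but two different distances to its leaves, so a leaf-based level would depend on which path happened to be written last.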

Your approach with setting the property on relationships will be a problem, as the same relationship can be present in multiple paths for multiple leaf nodes at varying depths. Your query could overwrite the property on the same relationship over and over until the last write wins. It would be better to match down to all nodes (leave out :Leaf in the query), take the last relationship in the path, and set its depth:

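// Match from the known node down to every descendant (no :Leaf filter)
MATCH path=(:Parent {Parent_ID: $known_value})<-[:R*]-()
// The last relationship in each path is the one leading to that descendant
WITH length(path) AS depth, last(relationships(path)) AS rel
SET rel.Level = depth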
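As a rough model of what that query does, the Python sketch below walks a hypothetical tree from the root and records one level per relationship, keyed by a (child, parent) pair. The tree and names are made up for illustration. Because every node in a tree is reached by exactly one path from the root, each relationship is written exactly once.

```python
# Hypothetical tree: parent -> children (illustrative names only).
children = {"root": ["a", "b"], "a": ["c"], "b": [], "c": []}

def relationship_levels(tree, root):
    """Return ({(child, parent): level}, number_of_writes).

    The level is the length of the root-to-child path, which is the
    position of that relationship as the last one in the path.
    """
    levels = {}
    writes = 0
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        for child in tree[node]:
            levels[(child, node)] = depth + 1  # last relationship of the path
            writes += 1
            stack.append((child, depth + 1))
    return levels, writes

levels, writes = relationship_levels(children, "root")
print(levels[("c", "a")])     # 2
print(writes == len(levels))  # True: no relationship is overwritten
```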

2.

So if all child nodes of a parent in the tree :like a movie then the parent should :like the movie. Something like this should work:

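// Visit the known node and every descendant, counting each node's direct children
MATCH path=(:Parent {Parent_ID: $known_value})<-[:R*0..]-(n)
WITH n, size((n)<-[:R]-()) as childCount
// For each movie, count how many of n's children like it
MATCH (n)<-[:R]-()-[:like]->(m:Movie)
WITH n, childCount, m, count(m) as movieLikes
// Keep only the movies liked by every child
WHERE childCount = movieLikes
MERGE (n)-[:like]->(m)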

The idea here is that for a movie, if the count of that movie node equals the count of the child nodes then all of the children liked the movie (provided that a node can only :like the same movie once).
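The same counting idea can be sketched in plain Python. This is only a model of the logic, with hypothetical child names and movie titles; using sets mirrors the assumption that a node likes a given movie at most once.

```python
from collections import Counter

# Hypothetical data: each child's liked movies (sets, so no duplicate likes).
likes_by_child = {
    "child1": {"Alien", "Brazil", "Casablanca"},
    "child2": {"Alien", "Casablanca"},
    "child3": {"Casablanca", "Dune"},
}

def common_likes(likes_by_child):
    """Movies whose like-count equals the number of children."""
    child_count = len(likes_by_child)
    movie_likes = Counter(m for movies in likes_by_child.values() for m in movies)
    return {m for m, n in movie_likes.items() if n == child_count}

print(common_likes(likes_by_child))  # {'Casablanca'}
```

This works for any number of children: with a single child, every movie that child likes has a count of 1, so all of them qualify, which matches the "1 of 1" case in the question.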

However, this query can't be used to build up likes from the bottom up: the :like relationships (liking personally, as opposed to liking because all children liked it) would have to be present on all nodes first for this query to work.

3.

To do a bottom-up approach, you would need to force the query to execute in a particular order. I believe the best way to do that is to first order the nodes to process by depth, then use apoc.cypher.doIt(), a procedure in APOC Procedures that lets you execute an entire Cypher query per row, to do the calculation.

This approach should work:

MATCH path=(:Parent {Parent_ID: $known_value})<-[:R*0..]-(n)
WHERE NOT n:Leaf // leaves should have :like relationships already created
WITH n, length(path) as depth, size((n)<-[:R]-()) as childCount
ORDER BY depth DESC
CALL apoc.cypher.doIt("
 MATCH (n)<-[:R]-()-[:like]->(m:Movie)
 WITH n, childCount, m, count(m) as movieLikes
 WHERE childCount = movieLikes
 MERGE (n)-[:like]->(m)
 RETURN count(m) as relsCreated",
 {n:n, childCount:childCount}) YIELD value
RETURN sum(value.relsCreated) as relsCreated

That said, I'm not sure this will do what you think it will do. Or rather, it will only work the way you think it will if the only :like relationships to movies are initially set on just the leaf nodes, and (prior to running this propagation query) no other intermediate node in the tree has any :like relationship to a movie.
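To make the bottom-up order and that caveat concrete, here is a small Python model of the propagation. It is a sketch under assumptions: the family tree, the names, and the movie titles are all hypothetical, and the set union stands in for MERGE.

```python
# Hypothetical family tree (parent -> children) with likes set only on leaves.
children = {
    "grandparent": ["parent1", "parent2"],
    "parent1": ["kid1", "kid2"],
    "parent2": ["kid3"],
    "kid1": [], "kid2": [], "kid3": [],
}
likes = {
    "kid1": {"Alien", "Casablanca"},
    "kid2": {"Casablanca", "Dune"},
    "kid3": {"Brazil", "Casablanca"},
}

def propagate_likes(tree, root, likes):
    """Process non-leaf nodes deepest first; each one ends up liking what all
    of its children like. Likes already on a node are kept, as with MERGE."""
    depth = {root: 0}
    order = [root]
    for node in order:  # the list grows while iterating: a simple breadth-first walk
        for child in tree[node]:
            depth[child] = depth[node] + 1
            order.append(child)
    result = {node: set(likes.get(node, ())) for node in tree}
    inner = [n for n in order if tree[n]]  # skip leaves
    for node in sorted(inner, key=lambda n: depth[n], reverse=True):
        shared = set.intersection(*(result[c] for c in tree[node]))
        result[node] |= shared
    return result

result = propagate_likes(children, "grandparent", likes)
print(result["parent1"])      # {'Casablanca'}
print(result["grandparent"])  # {'Casablanca'}
```

If an intermediate node such as "parent1" already had likes of its own before the run, those would take part in the intersection computed for "grandparent", which is the situation the caveat above warns about.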

Upvotes: 1
