Reputation: 1243
As a sort of validation test, I want to count the number of neighbouring nodes with a given labelB for every node with labelA, and then return any labelA node for which the number of neighbours is not equal to 2.
Basically, parent:parent_name should always have 2 connected nodes with the label child_name. How do I return the nodes for which this statement is False?
At the moment I am doing a very time-consuming "match everything in neo4j" and then a groupby and count in pandas.
Cypher:
MATCH (parent: parent_name)
MATCH (parent)-->(child: child_name)
RETURN parent.id, child.id, 'child_name' as child_label, 'parent_name' as parent_label
Pandas post-processing:
# Count the returned child rows per parent; parents with no children never appear in df
grouped = df.groupby(['parent_label', 'parent.id']).size()
result = grouped[grouped != 2].index # returns pairs of parent_label and parent.id
I can't do this at scale. Finding neighbours is one of the main use-cases of graphs! There must be a way to do this!
Maybe using UNWIND for each parent_name? If I use count under an UNWIND it simply counts the total, not the number for that node...
Upvotes: 0
Views: 690
Reputation: 66999
Here is one way to get the id of each parent node that has the wrong number of child nodes, along with a (possibly empty) list of its existing child ids:
MATCH (parent:parent_name)
WITH parent.id AS parentId, [(parent)-->(child:child_name) | child.id] AS childIds
WHERE SIZE(childIds) <> 2
RETURN parentId, childIds
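To replace the pandas post-processing entirely, the query above could be run from Python and its rows collected directly. This is only a minimal sketch: the helper name find_invalid_parents is hypothetical, and it assumes a session-like object whose run() method yields records that can be indexed by column name, as sessions from the official neo4j Python driver do.

```python
# Sketch only: the labels and property names are the ones used in the question.
QUERY = """
MATCH (parent:parent_name)
WITH parent.id AS parentId, [(parent)-->(child:child_name) | child.id] AS childIds
WHERE size(childIds) <> 2
RETURN parentId, childIds
"""


def find_invalid_parents(session):
    """Return {parentId: childIds} for every parent without exactly 2 children.

    `session` is assumed to expose run(query) and to yield records that
    support record["column"] lookups (hypothetical helper, not a driver API).
    """
    return {
        record["parentId"]: list(record["childIds"])
        for record in session.run(QUERY)
    }


# Hypothetical usage with the official driver (URI and credentials are placeholders):
# from neo4j import GraphDatabase
# with GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password")) as driver:
#     with driver.session() as session:
#         print(find_invalid_parents(session))
```

Because the counting and filtering happen inside the database, only the offending parents travel back to Python, including parents with no children at all.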
Upvotes: 1
Reputation: 5385
A general approach could be along these lines, finding any node with label :LabelA that does not have exactly two neighbours with label :LabelB (regardless of the direction of the relationship):
MATCH (n:LabelA)
WITH n,
SIZE([(n)--(m:LabelB) | m ]) AS nodeCountLabelB
WHERE nodeCountLabelB <> 2
RETURN n
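To sanity-check a query like this on a small exported sample, the same rule can be written in plain Python. This is a hedged sketch, not part of the answer: the function name and the input shapes (a dict of node id to label set, and a list of start/end pairs) are assumptions made for illustration. It also counts distinct neighbours, whereas the pattern comprehension above yields one entry per matching relationship, so the two can differ if two nodes are linked by more than one relationship.

```python
from collections import defaultdict


def nodes_without_two_neighbours(labels, edges, label_a="LabelA",
                                 label_b="LabelB", expected=2):
    """Return ids of label_a nodes whose number of distinct label_b
    neighbours differs from `expected`, ignoring relationship direction."""
    neighbours = defaultdict(set)
    for start, end in edges:
        # Treat every relationship as undirected: record both endpoints.
        if label_b in labels.get(end, ()):
            neighbours[start].add(end)
        if label_b in labels.get(start, ()):
            neighbours[end].add(start)
    return sorted(
        node
        for node, node_labels in labels.items()
        if label_a in node_labels and len(neighbours[node]) != expected
    )


# Tiny made-up sample: a1 has two LabelB neighbours, a2 has one, a3 has none.
labels = {
    "a1": {"LabelA"}, "a2": {"LabelA"}, "a3": {"LabelA"},
    "b1": {"LabelB"}, "b2": {"LabelB"},
}
edges = [("a1", "b1"), ("b2", "a1"), ("a2", "b1")]
print(nodes_without_two_neighbours(labels, edges))
```

Nodes with no matching neighbours are reported as well, which mirrors the Cypher version, where an empty pattern comprehension has size 0.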
Upvotes: 1