Lucidnonsense

Reputation: 1243

Neo4j: Count the number of neighbours with a specific label for each node

As a sort of validation test, I want to count the number of neighbouring nodes with a given labelB for every node with labelA, and then return any labelA node for which the number of neighbours is not equal to 2.

Basically, every parent:parent_name node should always have exactly 2 connected nodes with the label child_name. How do I return the nodes for which this statement is false?

At the moment I am doing a very time-consuming "match everything in Neo4j" and then a groupby and count in pandas.

Cypher:

MATCH (parent: parent_name) 
MATCH (parent)-->(child: child_name)
RETURN parent.id, child.id, 'child_name' as child_label, 'parent_name' as parent_label

Pandas post-processing:

grouped = df.groupby(['parent_label', 'parent.id']).size()  # number of matched children per parent
result = grouped[grouped != 2].index  # returns pairs of parent_label and parent.id with the wrong count

I can't do this at scale. Finding neighbours is one of the main use-cases of graphs! There must be a way to do this!

Maybe using UNWIND for each parent_name? If I use count under an UNWIND, it simply counts the total, not the number for that node...

Upvotes: 0

Views: 690

Answers (2)

cybersam

Reputation: 66999

Here is one way to get the id of each parent node that has the wrong number of child nodes, along with a (possibly empty) list of its existing child ids:

MATCH (parent:parent_name)
WITH parent.id AS parentId, [(parent)-->(child:child_name) | child.id] AS childIds
WHERE SIZE(childIds) <> 2
RETURN parentId, childIds
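
To make the semantics of the query above concrete, here is a minimal in-memory sketch in plain Python. It is only an illustration, not how Neo4j evaluates the query: the function name, the `parent_ids` list and the `edges` pairs are hypothetical stand-ins for the matched parent nodes and their outgoing relationships to child_name nodes. The point it shows is that every parent is kept in the result set, so a parent with no children at all is reported with an empty list.

```python
from collections import defaultdict


def parents_with_wrong_child_count(parent_ids, edges, expected=2):
    """Return {parent_id: [child_ids]} for parents whose child count != expected.

    parent_ids: ids of all parent nodes (hypothetical stand-in for MATCH (parent:parent_name)).
    edges: (parent_id, child_id) pairs (hypothetical stand-in for (parent)-->(child:child_name)).
    """
    children = defaultdict(list)
    for parent_id, child_id in edges:
        children[parent_id].append(child_id)

    # Iterate over all parents, not over the edges, so childless parents are included.
    return {
        pid: children.get(pid, [])
        for pid in parent_ids
        if len(children.get(pid, [])) != expected
    }


# Made-up sample data: p1 has two children, p2 has one, p3 has none.
parent_ids = ["p1", "p2", "p3"]
edges = [("p1", "c1"), ("p1", "c2"), ("p2", "c3")]
result = parents_with_wrong_child_count(parent_ids, edges)
print(result)
```

By contrast, an approach that starts from matched parent-child rows and groups them afterwards never sees a parent like p3, because no row exists for it.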

Upvotes: 1

Graphileon

Reputation: 5385

A general approach could be along these lines: find any node with label :LabelA that does not have exactly two neighbours with label :LabelB (regardless of the direction of the relationship):

MATCH (n:LabelA)
WITH n,
     SIZE([(n)--(m:LabelB) | m ])  AS nodeCountLabelB
WHERE nodeCountLabelB <> 2
RETURN n
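
Since labels cannot be passed as query parameters in Cypher, running this check for several label pairs from Python usually means building the query text. The following is a rough sketch under that assumption; the helper name `build_count_check_query` and its arguments are hypothetical, and the backtick quoting is a basic precaution rather than a complete defence against untrusted input.

```python
def build_count_check_query(label_a, label_b, expected=2, directed=False):
    """Build Cypher text returning label_a nodes without exactly `expected` label_b neighbours."""

    def quote(label):
        # Backtick-quote the label and double any embedded backtick.
        return "`" + label.replace("`", "``") + "`"

    # "--" ignores relationship direction, "-->" follows outgoing relationships only.
    arrow = "-->" if directed else "--"
    return (
        f"MATCH (n:{quote(label_a)})\n"
        f"WITH n, SIZE([(n){arrow}(m:{quote(label_b)}) | m]) AS neighbourCount\n"
        f"WHERE neighbourCount <> {int(expected)}\n"
        "RETURN n"
    )


query = build_count_check_query("LabelA", "LabelB")
print(query)
```

The resulting string can then be sent through whichever Neo4j client is in use. Note that the pattern comprehension yields one entry per matching path, so a neighbour connected by two relationships would be counted twice.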
   

Upvotes: 1
