rldc
rldc

Reputation: 21

Finding nodes with greater than average number of connections

I am trying to find all the nodes in a graph with an above average number of connections, something like:

START n=node(*) 
MATCH n-[r]-() 
WITH n, count(r) AS cnt
WITH n, cnt, avg(cnt) AS av 
WHERE cnt > av
RETURN n, cnt

But this returns 0 rows - I believe that 'av' here is equal to 'cnt'.

I wondered if creating a collection from cnt to pass to avg would help but this just produces an error.

Upvotes: 1

Views: 1228

Answers (2)

Eve Freeman
Eve Freeman

Reputation: 33185

Just chiming in. You can also do this in one query using WITH (even with the same syntax Werner recommends). I also cheated a bit by using length and a pattern to get the count... which some people say is ugly, but it avoids needing to use aggregation which simplifies stuff like this a fair bit.

START n=node(*) 
WITH avg(length(n--())) as avgr 
START n=node(*) 
WHERE length(n--()) > avgr 
RETURN length(n--()) as rc, avgr, n;

http://console.neo4j.org/r/2sp1rt

Upvotes: 1

You have to do this with two queries:

One query, to get the average number of connections:

START n=node(*) 
MATCH n-[r]-() 
WITH n, count(r) as rc 
WITH avg(rc) as avg 
RETURN avg

And then one query, to return node with a relation count above the average:

START n=node(*) 
MATCH n-[r]-() 
WITH n, count(r) as rc 
WITH avg(rc) as {avg_from_first_query} 
RETURN avg

Upvotes: 0

Related Questions