Reputation: 531
I have a large graph and exactly 2 days of Neo4j under my belt, i.e. zilch.
I want to compute the average of a certain numerical property, say n.prop_01, over all nodes (n) in which it occurs. Then I want to get the nodes whose n.prop_01 is below a certain threshold that depends on that average.
I could not find an answer, or anything inspiring, in the documentation or on online forums. I tried the following, among other things:
MATCH (n) WHERE exists(n.prop_01)
WITH n, collect(n.prop_01) AS all_prop_01
UNWIND all_prop_01 as all_prop
WITH n,prop_01,avg(all_prop) AS avg_prop
WHERE n.prop_01 < (1.1*avg_prop)
RETURN n.name,n.prop_01 LIMIT 20;
Result: (no changes, no records).
Thanks for any pointers as to why this does not work and how I could make it work.
Upvotes: 2
Views: 118
Reputation: 11216
You could compute the average upstream, when you are matching your nodes with prop_01.
You should also consider adding a label to the match to reduce the number of nodes your query needs to look through.
You can also collect the n nodes rather than just prop_01 and use that in your UNWIND later.
MATCH (n:Add_A_Node_Label_Here)
WHERE exists(n.prop_01)
WITH collect(n) AS all_n, avg(n.prop_01) as avg_prop
UNWIND all_n as n
WITH n, avg_prop
WHERE n.prop_01 < (1.1 * avg_prop)
RETURN n.name, n.prop_01
LIMIT 20;
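If you are on a newer Neo4j release, note that the exists(n.prop_01) form of the property check was deprecated in the 4.x line and removed in Neo4j 5. A sketch of the same query using IS NOT NULL instead, assuming the same placeholder label and also returning the average so you can sanity-check the threshold:

```cypher
// Same approach as above, with the property-existence check written as IS NOT NULL.
// Add_A_Node_Label_Here is still a placeholder for your real label.
MATCH (n:Add_A_Node_Label_Here)
WHERE n.prop_01 IS NOT NULL
// No grouping key here, so avg() runs once over all matched nodes.
WITH collect(n) AS all_n, avg(n.prop_01) AS avg_prop
// Turn the collected list back into one row per node, keeping the global average.
UNWIND all_n AS n
WITH n, avg_prop
WHERE n.prop_01 < 1.1 * avg_prop
RETURN n.name, n.prop_01, avg_prop
LIMIT 20;
```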
Upvotes: 1
Reputation: 29172
You need to calculate the average first and then filter. Otherwise, because n is carried through each WITH as a grouping key, the aggregation runs once per node, and the average is equal to the property of each node:
MATCH (n) WHERE exists(n.prop_01)
WITH avg(n.prop_01) as avg_prop
MATCH (n) WHERE exists(n.prop_01) AND n.prop_01 < 1.1 * avg_prop
RETURN count(n), avg_prop
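To see the grouping effect in isolation, here is a small sketch that needs no data in the graph. The literal list and the names x and avg_x are made up for illustration; run the two statements separately.

```cypher
// Grouped: x appears next to the aggregate, so it acts as a grouping key
// and avg() is computed separately for each distinct x.
UNWIND [1.0, 2.0, 6.0] AS x
WITH x, avg(x) AS avg_x
RETURN x, avg_x;
// Each row's avg_x is equal to its own x.

// Global: no grouping key, so avg() runs once over all rows.
UNWIND [1.0, 2.0, 6.0] AS x
RETURN avg(x) AS avg_x;
// One row, avg_x = 3.0
```

The query in the question is in the first situation: n is listed in both WITH clauses, so every aggregate only ever sees one node.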
Upvotes: 2