Reputation: 414
I am working on APOC procedures of neo4j. I am unable to get average of distance for all nodes returned by below query.
match(n:Company)
WHERE NOT (n)-[:refersTo]->() and n.name starts with "google"
with n as company, apoc.text.distance('google', n.name_lower) as distance
return company.name, distance, avg(distance)
The value of distance and avg(distance) is always same. Below is image attached of results sample.
EDIT 1: Tried suggested query:
match(n:Company)
WHERE NOT (n)-[:refersTo]->() and n.name_lower starts with "google"
with n as company, apoc.text.distance('google', n.name_lower) as distance
with company, collect(distance) as distances, avg(distance) as distAvg
unwind distances as distance
return company.name as name, distance, distAvg
Got same results.
Upvotes: 0
Views: 58
Reputation: 66999
This query will return a collection of name/distance pairs and the average of all the distances:
MATCH (n:Company)
WHERE NOT (n)-[:refersTo]->() AND n.name_lower starts with "google"
WITH n.name AS name, apoc.text.distance('google', n.name_lower) AS distance
RETURN COLLECT({name: name, distance: distance}) as data, AVG(distance) as distAvg;
Here is a sample result:
╒══════════════════════════════════════════════════════════════════════╤═════════╕
│"data" │"distAvg"│
╞══════════════════════════════════════════════════════════════════════╪═════════╡
│[{"name":"Google Inc.","distance":5},{"name":"Google Max","distance":4│4.5 │
│}] │ │
└──────────────────────────────────────────────────────────────────────┴─────────┘
Upvotes: 1
Reputation: 30397
Aggregation in Neo4j is with respect to the non-aggregation columns. In your query, for each company and its distance, you're getting the average of that single distance value.
You need to either collapse or remove the distance
values (so the average is with respect to all distances). Try this instead:
match(n:Company)
WHERE NOT (n)-[:refersTo]->() and n.name starts with "google"
with n as company, apoc.text.distance('google', n.name_lower) as distance
with company, collect(distance) as distances, avg(distance) as distAvg
unwind distances as distance
return company.name as name, distance, distAvg
EDIT
As noted by cybersam, this is getting the average distance per company rather than overall. I'll let this stand as an example of how aggregation works and can be tricky. Cybersam's answer provides the correct means of aggregating.
Upvotes: 0