Nirmal Vatsyayan
Nirmal Vatsyayan

Reputation: 414

Neo4j aggregation for dynamic values

I am working on APOC procedures of neo4j. I am unable to get average of distance for all nodes returned by below query.

match(n:Company)  
WHERE NOT (n)-[:refersTo]->() and n.name starts with "google" 
with n as company, apoc.text.distance('google', n.name_lower) as distance  
return company.name, distance, avg(distance)

The value of distance and avg(distance) is always same. Below is image attached of results sample.

enter image description here

EDIT 1: Tried suggested query:

match(n:Company)  
WHERE NOT (n)-[:refersTo]->() and n.name_lower starts with "google" 
with n as company, apoc.text.distance('google', n.name_lower) as distance  
with company, collect(distance) as distances, avg(distance) as distAvg
unwind distances as distance
return company.name as name, distance, distAvg

Got same results.

enter image description here

Upvotes: 0

Views: 58

Answers (2)

cybersam
cybersam

Reputation: 66999

This query will return a collection of name/distance pairs and the average of all the distances:

MATCH (n:Company)  
WHERE NOT (n)-[:refersTo]->() AND n.name_lower starts with "google"
WITH n.name AS name, apoc.text.distance('google', n.name_lower) AS distance
RETURN COLLECT({name: name, distance: distance}) as data, AVG(distance) as distAvg;

Here is a sample result:

╒══════════════════════════════════════════════════════════════════════╤═════════╕
│"data"                                                                │"distAvg"│
╞══════════════════════════════════════════════════════════════════════╪═════════╡
│[{"name":"Google Inc.","distance":5},{"name":"Google Max","distance":4│4.5      │
│}]                                                                    │         │
└──────────────────────────────────────────────────────────────────────┴─────────┘

Upvotes: 1

InverseFalcon
InverseFalcon

Reputation: 30397

Aggregation in Neo4j is with respect to the non-aggregation columns. In your query, for each company and its distance, you're getting the average of that single distance value.

You need to either collapse or remove the distance values (so the average is with respect to all distances). Try this instead:

match(n:Company)  
WHERE NOT (n)-[:refersTo]->() and n.name starts with "google" 
with n as company, apoc.text.distance('google', n.name_lower) as distance  
with company, collect(distance) as distances, avg(distance) as distAvg
unwind distances as distance
return company.name as name, distance, distAvg

EDIT

As noted by cybersam, this is getting the average distance per company rather than overall. I'll let this stand as an example of how aggregation works and can be tricky. Cybersam's answer provides the correct means of aggregating.

Upvotes: 0

Related Questions