Neo4j: querying over 2 relationships and computing an aggregate function on the 2nd relationship

Question

I have this data model:

With nodes:

Client
Vendor
Product

And the relationships:

Client - Recommend {qualification} -> Product
Client - Buy -> Product
Vendor - Sell -> Product

I am trying to get the top-selling products together with their average qualification. This is the query I am currently using:

MATCH (p:Product)<-[b:Buy]-(c:Client)
CALL{
    WITH c, p
    MATCH (c)-[r:Recommend]->(p)
    RETURN avg(r.qualification) as average_qualification
} 
RETURN p, c, count(b) as qty, average_qualification 
ORDER BY qty DESC

But the query returns one row per Client for each Product, each with its own average_qualification.

I want to group per Product instead, so rows for the same product need to be merged. For example, rows 1 and 4 of my result would be merged, and average_qualification would be the average over the whole product (not split by client).

nimrod serok · Accepted Answer

You can do something like:

MATCH (p:Product)<-[b:Buy]-(:Client)
WITH p, count(b) AS qty
MATCH (:Client)-[r:Recommend]->(p)
RETURN p, qty, avg(r.qualification) AS average_qualification
ORDER BY qty DESC

With this sample data:

MERGE (a:Client{name: 'A'})
MERGE (b:Client{name: 'B'})
MERGE (c:Client{name: 'C'})
MERGE (d:Client{name: 'D'})
MERGE (e:Client{name: 'E'})
MERGE (f:Vendor{key: 2})
MERGE (g:Vendor{key: 3})
MERGE (h:Vendor{key: 4})
MERGE (j:Product{key: 5})
MERGE (i:Product{key: 6})
MERGE (k:Product{key: 7})
MERGE (l:Product{key: 8})
MERGE (m:Product{key: 9})

MERGE (a)-[:Recommend{qualification: 4}]->(j)
MERGE (a)-[:Recommend{qualification: 4}]->(k)
MERGE (c)-[:Recommend{qualification: 3}]->(i)
MERGE (e)-[:Recommend{qualification: 3}]->(j)
MERGE (a)-[:Buy]->(k)
MERGE (a)-[:Buy]->(j)
MERGE (b)-[:Buy]->(l)
MERGE (d)-[:Buy]->(m)
MERGE (c)-[:Buy]->(i)
MERGE (d)-[:Buy]->(i)
MERGE (e)-[:Buy]->(i)
MERGE (e)-[:Buy]->(j)
MERGE (f)-[:Sell]->(i)
MERGE (f)-[:Sell]->(j)
MERGE (g)-[:Sell]->(k)
MERGE (h)-[:Sell]->(l)
MERGE (h)-[:Sell]->(m)

the query returns:

╒═════════╤═════╤═══════════════════════╕
│"p"      │"qty"│"average_qualification"│
╞═════════╪═════╪═══════════════════════╡
│{"key":6}│3    │3.0                    │
├─────────┼─────┼───────────────────────┤
│{"key":5}│2    │3.5                    │
├─────────┼─────┼───────────────────────┤
│{"key":7}│1    │4.0                    │
└─────────┴─────┴───────────────────────┘
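
Note that products 8 and 9 are missing from this result: they were bought but never recommended, so the second MATCH finds nothing for them and drops their rows. If you want to keep such products, one possible variant (a sketch, assuming a null average is acceptable for products without recommendations) is to make the second step an OPTIONAL MATCH:

```cypher
// Step 1: one row per product, with the number of Buy relationships
MATCH (p:Product)<-[b:Buy]-(:Client)
WITH p, count(b) AS qty
// Step 2: keep the product row even when no Recommend relationship exists
OPTIONAL MATCH (:Client)-[r:Recommend]->(p)
// avg() skips nulls, so a product with no recommendations gets a null average
RETURN p, qty, avg(r.qualification) AS average_qualification
ORDER BY qty DESC
```

With the sample data above, this should add products 8 and 9 to the result, each with qty 1 and a null average_qualification.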

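To understand this solution, and why it differs from yours, I recommend reading about the concept of cardinality. In your query, c is returned next to the aggregates, so it becomes part of the grouping key and every aggregate is computed per product and client pair instead of per product.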
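
As a small illustration of cardinality (a sketch against the sample data above, not something to use in production), here is what happens if both patterns are matched before aggregating. Every Buy row is combined with every Recommend row of the same product, so the rows multiply:

```cypher
// Both MATCH clauses run before any aggregation, so a product with
// 2 Buy and 2 Recommend relationships yields 2 * 2 = 4 rows.
MATCH (p:Product)<-[b:Buy]-(:Client)
MATCH (:Client)-[r:Recommend]->(p)
RETURN p,
       count(b) AS inflated_qty,      // counts the multiplied rows
       count(DISTINCT b) AS qty,      // counts each Buy relationship once
       avg(r.qualification) AS average_qualification
ORDER BY qty DESC
```

For the product with key 5, inflated_qty should come out as 4 while qty stays at 2. Aggregating with WITH before the second MATCH, as in the query above, avoids the multiplication in the first place.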

There is no need to carry both the [:Buy] relationship and the (:Client) node forward. Counting just one of them, grouped by p, lets the first MATCH end with one row per product rather than one row per client. The same goes for the second MATCH: applying avg to r.qualification, again grouped by p, keeps one row per product rather than one row per recommendation.
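
If you prefer to keep the CALL subquery from your original attempt, the same idea applies: reduce to one row per product first, then import only p into the subquery. This is a sketch that keeps the importing WITH syntax from your question; recent Neo4j 5 releases also offer a CALL (p) { ... } form, which I am not using here.

```cypher
// Reduce to one row per product before entering the subquery
MATCH (p:Product)<-[b:Buy]-(:Client)
WITH p, count(b) AS qty
CALL {
    // Import only the product, not the client
    WITH p
    MATCH (:Client)-[r:Recommend]->(p)
    // An aggregation without grouping keys returns exactly one row per product
    RETURN avg(r.qualification) AS average_qualification
}
RETURN p, qty, average_qualification
ORDER BY qty DESC
```

Because the subquery always returns a single aggregated row, products without any recommendation stay in the result with a null average_qualification.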
