Reputation: 669
I've been trying to write a Cypher query for the following task, but I am not getting the right results. Other Stack Overflow questions discuss LIMIT or collect, but I do not think those alone are enough for this task.
Task: I have (p:Product) nodes and between two product nodes there is a relationship called "BOUGHT_TOGETHER". That is
(p:Product)-[b:BOUGHT_TOGETHER]-(q:Product)
The relationship b has a property called "size" which contains a number. For each product, I want to return the top 3 relationships ordered by size, descending. For instance, the query result should look like the following.
+------+------+--------+
| p.id | q.id | b.size |
+------+------+--------+
| 1    | 2    | 10     |
| 1    | 3    | 8      |
| 1    | 5    | 7      |
| 2    | 21   | 34     |
| 2    | 17   | 20     |
| 2    | 35   | 15     |
| 3    | 5    | 49     |
| 3    | 333  | 30     |
| 3    | 65   | 5      |
| ...  | ...  | ...    |
+------+------+--------+
Can someone show me how to write a Cypher query that achieves the desired results? Thank you!
Upvotes: 5
Views: 6326
Reputation: 20175
Another solution is to first order the relationships, collect them into a collection, and then UNWIND only the first 3 elements of that collection:
MATCH (p:Product)-[r:BOUGHT_TOGETHER]->(:Product)
WITH p, r
ORDER BY r.size DESC
WITH p, collect(r) AS bts
UNWIND bts[0..3] AS r
RETURN p.uuid as pid, endNode(r).uuid as qid, r.size as size
Test console here: http://console.neo4j.org/r/r88ijn
NB: After re-reading jjaderberg's answer, I see this is quite similar, just (I think) more readable, which is why I upvoted his answer.
Upvotes: 8
Reputation: 45003
Cypher has LIMIT and ORDER BY clauses.
http://neo4j.com/docs/stable/query-limit.html
http://neo4j.com/docs/stable/query-order.html
MATCH (p:Product)-[b:BOUGHT_TOGETHER]-(q:Product)
RETURN p.id, q.id, b.size
ORDER BY b.size DESC
LIMIT 3;
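One thing to keep in mind with a query of this shape: LIMIT applies to the whole result set, so it yields three rows in total rather than three per product. The difference can be sketched in plain Python, with a list of tuples standing in for the matched rows. The sample rows and the function names `global_top` and `per_product_top` are made up for illustration; this models the semantics only and does not talk to Neo4j.

```python
# Rows as (p_id, q_id, size); a hypothetical sample standing in for MATCH results.
rows = [
    (1, 2, 10), (1, 3, 8), (1, 5, 7), (1, 4, 1),
    (2, 21, 34), (2, 17, 20), (2, 35, 15),
    (3, 5, 49), (3, 333, 30), (3, 65, 5),
]


def global_top(rows, n=3):
    """Mimics ORDER BY size DESC LIMIT n over the whole result set."""
    return sorted(rows, key=lambda r: r[2], reverse=True)[:n]


def per_product_top(rows, n=3):
    """Keeps the n largest rows for each p_id separately."""
    grouped = {}
    for row in sorted(rows, key=lambda r: r[2], reverse=True):
        bucket = grouped.setdefault(row[0], [])
        if len(bucket) < n:
            bucket.append(row)
    return grouped


print(global_top(rows))          # three rows overall, drawn from any product
print(per_product_top(rows)[1])  # the three largest rows for product 1 only
```

The global version never returns a row for product 1 here, because all three of its slots are taken by larger sizes belonging to products 2 and 3.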
Upvotes: 3
Reputation: 9952
Here's one way to do it (it seems there should be a way to use LIMIT, but I couldn't come up with one just now).
I generated an example graph with
FOREACH (a IN [[1,2,10],[1,3,8],[1,5,7],[2,21,34],[2,17,20],[2,35,15],[3,5,49],[3,333,30],[3,65,5],[1,4,1],[3,6,100]] |
  MERGE (p:Product { id: a[0] })
  MERGE (q:Product { id: a[1] })
  CREATE (p)-[:BOUGHT_TOGETHER { size: a[2] }]->(q)
)
This is the data from your table of desired output, plus two additional items: [1,4,1] and [3,6,100]. Having more than three relationships for some nodes helps us test that the query picks the correct three: the results for 1 should not contain [1,4,1], and the results for 3 should now contain [3,6,100] instead of [3,65,5].
If this is an accurate sample of your data, then this query should do what you want:
MATCH (p:Product)-[b:BOUGHT_TOGETHER]-(q:Product)
WITH p.id AS pid, q.id AS qid, b.size AS bsize
ORDER BY bsize DESC
WITH pid, collect([qid, bsize])[..3] AS qb
UNWIND qb AS uqb
RETURN pid, uqb[0] AS qid, uqb[1] AS bsize
ORDER BY pid, bsize DESC
The idea is to order all the result items by b.size, then collect them per p and throw away all but the first three items in each collection, then unwind and return. The results will not look exactly like your output table because they include the relationships in the other direction as well ([5,1,7] as well as [1,5,7]), but I think that's what you would want anyway.
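To make the order, collect, slice, unwind steps concrete, here is a rough Python sketch of the same logic on the example data. It only models what the query does on plain lists (it does not run against Neo4j), and the function name `top_per_product` is invented for the illustration.

```python
from collections import defaultdict

# Directed edges (from_id, to_id, size), the same data as the example graph above.
edges = [
    (1, 2, 10), (1, 3, 8), (1, 5, 7), (2, 21, 34), (2, 17, 20), (2, 35, 15),
    (3, 5, 49), (3, 333, 30), (3, 65, 5), (1, 4, 1), (3, 6, 100),
]


def top_per_product(edges, n=3):
    # An undirected MATCH yields every relationship once from each end.
    rows = [(a, b, s) for a, b, s in edges] + [(b, a, s) for a, b, s in edges]
    # ORDER BY bsize DESC
    rows.sort(key=lambda r: r[2], reverse=True)
    # collect([qid, bsize]) grouped by pid, preserving the sorted order
    collected = defaultdict(list)
    for pid, qid, size in rows:
        collected[pid].append((qid, size))
    # [..3] trims each collection; UNWIND flattens it back into rows
    result = []
    for pid in sorted(collected):
        for qid, size in collected[pid][:n]:
            result.append((pid, qid, size))
    return result


result = top_per_product(edges)
print([r for r in result if r[0] == 3])  # rows kept for product 3
print([r for r in result if r[0] == 5])  # product 5 appears via reversed edges
```

Product 5 has no outgoing relationships in the sample, yet it still gets rows, which is the "other direction" effect described above.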
If this works, you might want to see if you can defer reading off properties until after you have trimmed the collections to save some database hits.
Upvotes: 3