Reputation: 5385
I have a set of documents, each belonging to one or several groups.
People can recommend these documents.
For each group, I need the documents that rank in positions 1 to N by number of recommendations received, with ties included. For example, given this intermediate result and N=3:
Document Recommendations
a 7
b 4
c 4
d 3
it would return a, b and c.
With this:
Document Recommendations
a 6
b 5
c 4
d 4
it would return a, b, c and d.
And with this:
Document Recommendations
a 6
b 4
c 4
d 4
e 3
it would return a, b, c and d.
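The rule behind these three examples can be stated independently of Cypher: a document's position is 1 plus the number of documents with strictly more recommendations, and every document whose position is at most N is kept. The following is a minimal Python sketch of that reading; the function name `top_n_with_ties` and the dictionary input are illustrative assumptions, not part of the data model described here.

```python
def top_n_with_ties(counts, n):
    """Keep documents whose position is at most n, where position is
    1 + the number of documents with strictly more recommendations.

    `counts` maps a document name to its recommendation count
    (a hypothetical input shape chosen for this sketch).
    """
    selected = []
    for doc, recs in counts.items():
        # Tied documents share the same position, so a tie that starts
        # at a position <= n is always included in full.
        position = 1 + sum(1 for other in counts.values() if other > recs)
        if position <= n:
            selected.append(doc)
    # Highest count first; ties are listed alphabetically for readability.
    return sorted(selected, key=lambda d: (-counts[d], d))


print(top_n_with_ties({"a": 7, "b": 4, "c": 4, "d": 3}, 3))  # ['a', 'b', 'c']
```

Applied to the second and third examples with N=3, the same function keeps a, b, c and d in both cases.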
How do I do this in Cypher? This is what I have so far (I had planned to link to the console, but it doesn't seem to work):
//Groups
create (gr1:Group {name:"First group"})
create (gr2:Group {name:"Second group"})
//Persons
create (p1:Person {name:"Jan"})
create (p2:Person {name:"Marie"})
create (p3:Person {name:"Willem"})
create (p4:Person {name:"Simone"})
create (p5:Person {name:"Henk"})
create (p6:Person {name:"Ilse"})
create (p7:Person {name:"Tom"})
create (p8:Person {name:"Detlef"})
//Content
create (ci1:Contentitem { title: "Sometitle 1"})
create (ci2:Contentitem { title: "Sometitle 2"})
create (ci3:Contentitem { title: "Sometitle 3"})
create (ci4:Contentitem { title: "Sometitle 4"})
create (ci5:Contentitem { title: "Sometitle 5"})
create (ci6:Contentitem { title: "Sometitle 6"})
create (ci7:Contentitem { title: "Sometitle 7"})
create (ci8:Contentitem { title: "Sometitle 8"})
create (ci9:Contentitem { title: "Sometitle 9"})
create (ci10:Contentitem { title: "Sometitle 10"})
//Recommendations
create (ci1)-[:IsRecommendedBy]->(p4)
create (ci8)-[:IsRecommendedBy]->(p8)
create (ci1)-[:IsRecommendedBy]->(p1)
create (ci8)-[:IsRecommendedBy]->(p7)
create (ci5)-[:IsRecommendedBy]->(p6)
create (ci1)-[:IsRecommendedBy]->(p3)
create (ci8)-[:IsRecommendedBy]->(p3)
create (ci5)-[:IsRecommendedBy]->(p4)
create (ci8)-[:IsRecommendedBy]->(p5)
create (ci5)-[:IsRecommendedBy]->(p2)
create (ci5)-[:IsRecommendedBy]->(p1)
create (ci5)-[:IsRecommendedBy]->(p8)
create (ci2)-[:IsRecommendedBy]->(p1)
create (ci2)-[:IsRecommendedBy]->(p3)
create (ci2)-[:IsRecommendedBy]->(p7)
create (ci10)-[:IsRecommendedBy]->(p8)
create (ci3)-[:IsRecommendedBy]->(p4)
create (ci10)-[:IsRecommendedBy]->(p5)
create (ci3)-[:IsRecommendedBy]->(p1)
create (ci4)-[:IsRecommendedBy]->(p5)
create (ci4)-[:IsRecommendedBy]->(p8)
create (ci6)-[:IsRecommendedBy]->(p5)
create (ci9)-[:IsRecommendedBy]->(p1)
create (ci9)-[:IsRecommendedBy]->(p2)
create (ci6)-[:IsRecommendedBy]->(p6)
create (ci6)-[:IsRecommendedBy]->(p8)
//Group membership
create (ci1)-[:BelongsToGroup]->(gr1)
create (ci1)-[:BelongsToGroup]->(gr2)
create (ci2)-[:BelongsToGroup]->(gr1)
create (ci3)-[:BelongsToGroup]->(gr1)
create (ci4)-[:BelongsToGroup]->(gr1)
create (ci4)-[:BelongsToGroup]->(gr2)
create (ci5)-[:BelongsToGroup]->(gr1)
create (ci6)-[:BelongsToGroup]->(gr1)
create (ci7)-[:BelongsToGroup]->(gr1)
create (ci8)-[:BelongsToGroup]->(gr1)
create (ci8)-[:BelongsToGroup]->(gr2)
create (ci10)-[:BelongsToGroup]->(gr1)
create (ci10)-[:BelongsToGroup]->(gr2)
;
and the query
match (gr)<-[:BelongsToGroup]-(ci:Contentitem)-[:IsRecommendedBy]->(p:Person)
return gr.name,ci.title,count(p) as Recommendations
order by gr.name, Recommendations desc
which returns
gr.name ci.title Recommendations
----------------------------------------------------
First group Sometitle 5 5
First group Sometitle 8 4
First group Sometitle 1 3
First group Sometitle 6 3
First group Sometitle 2 3
First group Sometitle 4 2
First group Sometitle 3 2
First group Sometitle 10 2
Second group Sometitle 8 4
Second group Sometitle 1 3
Second group Sometitle 10 2
Second group Sometitle 4 2
with N=3, the final result should be
gr.name ci.title Recommendations
----------------------------------------------------
First group Sometitle 5 5
First group Sometitle 8 4
First group Sometitle 1 3
First group Sometitle 6 3
First group Sometitle 2 3
Second group Sometitle 8 4
Second group Sometitle 1 3
Second group Sometitle 10 2
Second group Sometitle 4 2
with N=2, the final result would be
gr.name ci.title Recommendations
----------------------------------------------------
First group Sometitle 5 5
First group Sometitle 8 4
Second group Sometitle 8 4
Second group Sometitle 1 3
What is important in the ex-aequo (tie) rule is that the set of tied documents with the lowest number of recommendations that starts at a position <= N is not cut off at an arbitrary position, but included in full. So, when the result before the cutoff looks like this:
gr.name ci.title Recommendations
----------------------------------------------------
First group Sometitle 5 5
First group Sometitle 8 4
First group Sometitle 1 4
First group Sometitle 6 3
First group Sometitle 2 3
First group Sometitle 4 2
First group Sometitle 3 2
First group Sometitle 10 2
Second group Sometitle 8 4
Second group Sometitle 1 4
Second group Sometitle 10 2
Second group Sometitle 4 2
Second group Sometitle 7 1
the end result for N=3 would be:
gr.name ci.title Recommendations
----------------------------------------------------
First group Sometitle 5 5
First group Sometitle 8 4
First group Sometitle 1 4
Second group Sometitle 8 4
Second group Sometitle 1 4
Second group Sometitle 10 2
Second group Sometitle 4 2
and for N=2
gr.name ci.title Recommendations
----------------------------------------------------
First group Sometitle 5 5
First group Sometitle 8 4
First group Sometitle 1 4
Second group Sometitle 8 4
Second group Sometitle 1 4
Upvotes: 0
Reputation: 9952
You got a good answer already (+1) but I was curious if it could be done without the second match. Here's what I came up with
MATCH (gr)<-[:BelongsToGroup]-(ci:Contentitem)-[r:IsRecommendedBy]->() // I dropped (p:Person) since it's not really relevant, and counted the [:IsRecommendedBy] instead
WITH gr, [ci, count(r)] AS document
ORDER BY document[1] desc
WITH gr, collect(document) as documents, collect(document[1]) as recommendations
WITH gr, documents, recommendations,
CASE WHEN length(recommendations) >= {n} THEN {n}-1 ELSE length(recommendations)-1 END as ix
RETURN gr.name AS group,[doc IN documents
WHERE doc[1]>= recommendations[ix]| [(doc[0]).title, doc[1]]] AS documents
I don't know if this is necessarily any better, but it's different: it uses one match, CASE WHEN instead of REDUCE/COALESCE to avoid the index-out-of-range problem, and returns one row per group with an ordered collection of [document, recommendations] pairs. Maybe there's something usable there.
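To make the index logic of this query easier to follow, here is a rough Python model of it, not a replacement for the Cypher. It sorts each group's documents by count, clamps the cutoff index to the last element when a group has fewer than N documents (the role of the CASE WHEN), and keeps every document at or above the cutoff count. The names `ROWS` and `top_n_per_group` are assumptions made for this sketch; the rows are copied from the intermediate result in the question.

```python
from collections import defaultdict

# (group, title, recommendations) rows from the question's intermediate result.
ROWS = [
    ("First group", "Sometitle 5", 5), ("First group", "Sometitle 8", 4),
    ("First group", "Sometitle 1", 3), ("First group", "Sometitle 6", 3),
    ("First group", "Sometitle 2", 3), ("First group", "Sometitle 4", 2),
    ("First group", "Sometitle 3", 2), ("First group", "Sometitle 10", 2),
    ("Second group", "Sometitle 8", 4), ("Second group", "Sometitle 1", 3),
    ("Second group", "Sometitle 10", 2), ("Second group", "Sometitle 4", 2),
]


def top_n_per_group(rows, n):
    """Return {group: [(title, count), ...]} keeping ties at the cutoff."""
    by_group = defaultdict(list)
    for group, title, recs in rows:
        by_group[group].append((title, recs))

    result = {}
    for group, docs in by_group.items():
        # Mirrors ORDER BY ... desc before collect(); the sort is stable.
        docs.sort(key=lambda d: d[1], reverse=True)
        # Mirrors the CASE WHEN: never index past the end of the list.
        ix = n - 1 if len(docs) >= n else len(docs) - 1
        cutoff = docs[ix][1]
        result[group] = [d for d in docs if d[1] >= cutoff]
    return result


for group, docs in top_n_per_group(ROWS, 2).items():
    print(group, docs)
```

With N=2 this keeps Sometitle 5 and 8 for the first group and Sometitle 8 and 1 for the second, matching the expected result in the question.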
Upvotes: 2
Reputation: 33155
I don't think there's a clean way to do exactly what you want, but here's my best try. It requires a second match after you figure out the cutoff.
Something like this:
match (gr)<-[:BelongsToGroup]-(ci:Contentitem)-[:IsRecommendedBy]->(p:Person)
with gr, ci, count(p) as recommendations
order by recommendations desc
with gr, collect(recommendations) as cutoffs
// coalesce here to avoid null problems if a group has fewer than N=3 recommended documents
with gr, coalesce(cutoffs[2], cutoffs[1], cutoffs[0]) as cutoff
match (gr)<-[:BelongsToGroup]-(ci:Contentitem)-[:IsRecommendedBy]->(p:Person)
with gr, ci, count(p) as recommendations, cutoff
where recommendations >= cutoff
return gr.name, ci.title, recommendations, cutoff
order by gr.name, recommendations desc;
gives:
+------------------------------------------------------------+
| gr.name | ci.title | recommendations | cutoff |
+------------------------------------------------------------+
| "First group" | "Sometitle 5" | 5 | 3 |
| "First group" | "Sometitle 8" | 4 | 3 |
| "First group" | "Sometitle 1" | 3 | 3 |
| "First group" | "Sometitle 6" | 3 | 3 |
| "First group" | "Sometitle 2" | 3 | 3 |
| "Second group" | "Sometitle 8" | 4 | 2 |
| "Second group" | "Sometitle 1" | 3 | 2 |
| "Second group" | "Sometitle 4" | 2 | 2 |
| "Second group" | "Sometitle 10" | 2 | 2 |
+------------------------------------------------------------+
9 rows
update: It occurred to me that you'd probably want to pass in N instead of having it hard-coded like this with coalesce. In that case, you could do:
with gr, reduce(acc=cutoffs[0], x in range(0, {N}-1)| coalesce(cutoffs[x], acc)) as cutoff
This will go through the range 0 to N-1 without the need to hard code it like the first solution.
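As an illustration of why the reduce works, the Python sketch below mimics it step by step. It assumes two Cypher behaviours: indexing past the end of a list yields null, and range(0, N-1) includes both endpoints. The helpers `item_or_none`, `coalesce` and `cutoff_for` are hypothetical stand-ins written for this sketch, not Cypher or driver APIs.

```python
from functools import reduce


def item_or_none(values, index):
    """Stand-in for Cypher list indexing, which yields null when out of range."""
    return values[index] if 0 <= index < len(values) else None


def coalesce(*args):
    """Stand-in for Cypher's coalesce: the first non-null argument, else None."""
    return next((a for a in args if a is not None), None)


def cutoff_for(cutoffs, n):
    """Walk indexes 0..n-1 and remember the last value that exists."""
    return reduce(
        lambda acc, x: coalesce(item_or_none(cutoffs, x), acc),
        range(0, n),  # Cypher's range(0, N-1) is inclusive, hence range(0, n)
        item_or_none(cutoffs, 0),
    )


print(cutoff_for([5, 4, 3, 3, 3, 2, 2, 2], 3))  # 3
print(cutoff_for([4, 3], 3))  # 3
```

The second call shows the fallback: with only two counts, index 2 is missing, so the accumulator keeps the last value that was found.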
Upvotes: 3