David D

Reputation: 1545

Generating N recommendations per person in Neo4j

I am following this tutorial on collaborative filtering in Neo4j. The tutorial first creates a toy movie graph, as follows:

LOAD CSV WITH HEADERS FROM "https://neo4j-contrib.github.io/developer-resources/cypher/movies_actors.csv" AS line
WITH line
WHERE line.job = "ACTED_IN"
MERGE (m:Movie {title:line.title}) ON CREATE SET m.released = toInt(line.released), m.tagline = line.tagline
MERGE (p:Person {name:line.name}) ON CREATE SET p.born = toInt(line.born)
MERGE (p)-[:ACTED_IN {roles:split(line.roles,";")}]->(m)
RETURN count(*);

Next, we propose five possible co-actors for Tom Hanks:

MATCH (tom:Person)-[:ACTED_IN]->(movie1)<-[:ACTED_IN]-(coActor:Person),
       (coActor)-[:ACTED_IN]->(movie2)<-[:ACTED_IN]-(coCoActor:Person)
WHERE tom.name = "Tom Hanks"
AND   NOT    (tom)-[:ACTED_IN]->()<-[:ACTED_IN]-(coCoActor)
RETURN coCoActor.name, count(distinct coCoActor) as frequency
ORDER BY frequency DESC
LIMIT 5

What if I want to run the same operation for every person who acted in "Apollo 13"? In other words, I want to propose five possible co-actors for each person who acted in "Apollo 13". How can I do this effectively?

Upvotes: 0

Views: 64

Answers (1)

Nicole White

Reputation: 7790

A few things here. The query you pasted doesn't really make any sense:

RETURN coCoActor.name, COUNT(DISTINCT coCoActor) AS frequency

Because the rows are grouped by coCoActor.name, each group contains just that one coCoActor, so this will always return a frequency of 1 and your ORDER BY is meaningless.

I think you meant this:

RETURN coCoActor.name, COUNT(DISTINCT coActor) AS frequency

Second, you don't need the variables movie1 and movie2; they aren't used anywhere else in your query.

Finally, you need to assert that you're not recommending an actor to himself or herself (here actor is the person receiving the recommendations, tom in your query):

WHERE actor <> coCoActor
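
Taken together, the three fixes applied to the original single-actor query might look like the sketch below. It keeps the variable name tom from the question in place of actor, and it assumes the graph was loaded exactly as in the question (Person and Movie labels, ACTED_IN relationships).

```cypher
// Sketch: the Tom Hanks query with the three fixes above applied.
MATCH (tom:Person)-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(coActor:Person),
      (coActor)-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(coCoActor:Person)
WHERE tom.name = "Tom Hanks"
  // Fix 3: never recommend Tom to himself.
  AND tom <> coCoActor
  // Keep only people Tom has not already acted with.
  AND NOT (tom)-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(coCoActor)
// Fix 1: count the shared co-actors, not the candidate itself.
RETURN coCoActor.name, COUNT(DISTINCT coActor) AS frequency
ORDER BY frequency DESC
LIMIT 5;
```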

To actually answer your question:

// Find the Apollo 13 actors.
MATCH (actor:Person)-[:ACTED_IN]->(:Movie {title:"Apollo 13"})

// Continue with query.
MATCH (actor)-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(coActor:Person),
      (coActor)-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(coCoActor:Person)
WHERE NOT (actor)-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(coCoActor) AND
      actor <> coCoActor

// Group by actor and coCoActor, counting how many coActors they share as freq.
WITH actor, coCoActor, COUNT(DISTINCT coActor) AS freq

// Order by freq descending so that COLLECT()[..5] grabs the top 5 per actor.
ORDER BY freq DESC

// Get the recommendations.
WITH actor, COLLECT({name: coCoActor.name, freq: freq})[..5] AS recos
RETURN actor.name, recos;
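
To see the ORDER BY plus COLLECT()[..n] step in isolation, here is a small sketch that needs no graph data. The rows, the actor keys "A" and "B", and the names are made up for illustration, and the slice keeps 2 entries per group instead of 5. Like the query above, it relies on COLLECT() receiving rows in the sorted order.

```cypher
// Hypothetical rows standing in for (actor, coCoActor, freq) results.
UNWIND [
  {actor: "A", name: "x", freq: 3},
  {actor: "A", name: "y", freq: 7},
  {actor: "A", name: "z", freq: 5},
  {actor: "B", name: "x", freq: 1},
  {actor: "B", name: "w", freq: 4}
] AS row

// Sort all rows by freq first ...
WITH row
ORDER BY row.freq DESC

// ... then group by actor and keep only the first 2 collected entries.
WITH row.actor AS actor,
     COLLECT({name: row.name, freq: row.freq})[..2] AS recos
RETURN actor, recos;
```

A plain LIMIT would cap the total number of rows rather than the number per actor, which is why the query slices the collected list instead.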

Upvotes: 2
