J Thomas

Reputation: 23

Neo4j Movie Tutorial query

I am looking at the Neo4j movie sample project: https://github.com/neo4j-examples/movies-java-spring-data-neo4j-4

One of the examples recommends new co-actors for Tom Hanks, i.e. it finds actors that Tom Hanks hasn't yet worked with, but his co-actors have.

The Query:

MATCH (tom:Person {name:"Tom Hanks"})-[:ACTED_IN]->(m)<-[:ACTED_IN]-(coActors),
      (coActors)-[:ACTED_IN]->(m2)<-[:ACTED_IN]-(cocoActors)
WHERE NOT (tom)-[:ACTED_IN]->(m2)
RETURN cocoActors.name AS Recommended, count(*) AS Strength ORDER BY Strength DESC

The top 3 results are:

Recommended    Strength
Tom Cruise     5
Zach Grenier   5
Helen Hunt     4

However, Helen Hunt is returned in the list of Tom Hanks's co-actors:

MATCH (tom:Person {name:"Tom Hanks"})-[:ACTED_IN]->(m)<-[:ACTED_IN]-(coActors) RETURN coActors.name

And Tom Hanks is returned in the list of Helen Hunt's co-actors:

MATCH (tom:Person {name:"Helen Hunt"})-[:ACTED_IN]->(m)<-[:ACTED_IN]-(coActors) RETURN coActors.name 

Is this a bug in Neo4j or in the query given in the tutorial? If it is a bug in the query, what is the correct query?

Upvotes: 1

Views: 910

Answers (1)

Nicole White

Reputation: 7790

That query isn't finding people whom Tom Hanks hasn't worked with yet. I'm not sure what they were going for there, but to accomplish that you should use:

MATCH (tom:Person)-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(coActors:Person),
      (coActors)-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(coCoActors:Person)
WHERE tom.name = 'Tom Hanks' AND
      NOT (tom)-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(coCoActors)
RETURN coCoActors.name AS recommended, count(*) AS strength 
ORDER BY strength DESC;

The line WHERE NOT (tom)-[:ACTED_IN]->(m2) from their query doesn't do what the description says; all it asserts is that Tom Hanks didn't act in m2, the particular movie that a co-actor and a co-co-actor appeared in together. It does nothing to assert that Tom Hanks has never acted with the people bound to cocoActors. That is why Helen Hunt shows up: she shares a movie with Tom Hanks, but she is also reachable through a different movie that he wasn't in.
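To make the difference between the two WHERE clauses concrete, here is a minimal sketch in plain Python that walks the same tom -> m <- coActor -> m2 <- cocoActor paths over a tiny in-memory graph. It is not Cypher and not how Neo4j evaluates the query; the casts are invented for illustration (they are not the real movie graph), and the names CAST, co_actors and recommend are hypothetical helpers. The extra check that skips the starting actor in the "actor" filter is my own assumption and is not part of either query.

```python
from collections import Counter

# Invented toy casts (NOT the real movie graph): movie -> actors.
CAST = {
    "Movie 1": {"Tom Hanks", "Helen Hunt"},
    "Movie 2": {"Tom Hanks", "Meg Ryan"},
    "Movie 3": {"Meg Ryan", "Helen Hunt", "Tom Cruise"},
}


def co_actors(person):
    """Everyone who shares at least one movie with `person`."""
    shared = set()
    for cast in CAST.values():
        if person in cast:
            shared |= cast
    return shared - {person}


def recommend(person, filter_by):
    """Count person -> m <- co -> m2 <- coco paths that survive the filter.

    filter_by="movie": the tutorial's test, person did not act in m2.
    filter_by="actor": the answer's test, person never acted with coco.
    """
    known = co_actors(person)
    strength = Counter()
    for m, cast in CAST.items():
        if person not in cast:
            continue
        for co in cast - {person}:
            for m2, cast2 in CAST.items():
                # Cypher does not reuse a relationship within one MATCH,
                # so co's second ACTED_IN must lead to a different movie.
                if m2 == m or co not in cast2:
                    continue
                for coco in cast2 - {co}:
                    if filter_by == "movie":
                        keep = person not in cast2
                    else:
                        # `coco != person` is an extra guard (assumption).
                        keep = coco not in known and coco != person
                    if keep:
                        strength[coco] += 1
    return dict(strength)


print(recommend("Tom Hanks", "movie"))
print(recommend("Tom Hanks", "actor"))
```

With this toy data, the "movie" filter still recommends Helen Hunt and Meg Ryan, both of whom already share a movie with Tom Hanks, because each is reached through Movie 3, which he isn't in. The "actor" filter leaves only Tom Cruise.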

Upvotes: 2
