Reputation: 1533
The title is a bit of a mess, but I have an image which accurately describes what I'm trying to achieve.
As an example, I'm working in neo4j's sandbox, with the Movies-dataset.
The task: Find all actors that have worked on a movie with Tom Hanks more than twice (3 or more times).
Here's the query (and result) which shows all actors that have worked with him at all, and the movies they've been part of.
MATCH (p1:Person {name: "Tom Hanks"})-->(m:Movie)<--(p2:Person)
RETURN m, p2
To save you some time, the only person who has worked with Tom at least thrice (3 times) is Meg Ryan to the right.
So, as a freshman at this query language, my immediate thought was to try the following Cypher-query:
MATCH (p1:Person {name: "Tom Hanks"})-->(m:Movie)<--(p2:Person)
WHERE count(m) > 2
RETURN p2, m
This gave an error, telling me that I can't put the count
-function there.
I've also tried using the WITH
-keyword:
MATCH (p1:Person {name: "Tom Hanks"})-->(m:Movie)<--(p2:Person)
WITH p1, m, p2, count(m) AS common_movie_count
WHERE common_movie_count > 2
RETURN DISTINCT p2, m
...but that didn't help me much, and although it did run, it gave me an empty output (meaning no matches).
For some reason I was allowed to get the names of relevant actors (and the count), as long as I accept getting ALL actors, in table format.
MATCH (p1:Person {name: "Tom Hanks"})-->(m:Movie)<--(p2:Person)
RETURN DISTINCT p2.name AS name, count(m) as common_movie_count
ORDER BY common_movie_count DESC
This query returned the following table (or started with these three results): | name | common_movie_count | | -------------- | ------------------ | | "Meg Ryan" | 3 | | "Ron Howard" | 2 | | "Gary Sinise" | 2 |
I want the nodes, not just the name. Also: I only want the nodes that are relevant (count > 3
), as in the dataset I intend to apply this, there will be too many non-relevant nodes to apply this to them all.
Do you have any ideas or simple solutions to my problem? To me the problems seems so simple that I can't be the first one to run into it, but I can't seem to google my way to a good solution.
Upvotes: 1
Views: 104
Reputation: 5385
You were pretty close here:
MATCH (p1:Person {name: "Tom Hanks"})-->(m:Movie)<--(p2:Person)
WITH p1, m, p2, count(m) AS common_movie_count
WHERE common_movie_count > 2
RETURN DISTINCT p2, m
except that the m
in the WITH
statement makes common_movie_count
always return 0
If you do this:
MATCH (p1:Person {name: "Tom Hanks"})-->(m:Movie)<--(p2:Person)
WITH p1, p2, COLLECT( DISTINCT m) AS common_movies
WHERE SIZE(common_movies) > 2
UNWIND common_movies AS m
RETURN p2, m
you would get the expected result
Upvotes: 2