guifreballester
guifreballester

Reputation: 47

Using Cypher to find most popular films with low rating. Merge query

I'm currently trying to create a query using the neo4j dataset example of movies. I want to select the movies with a lot of tags (more popular) along with the one that have less ratings.

I have figure out how to find the most popular (more than the average):

MATCH (m:MOVIE)-[r:HAS_TAG]->() 
WITH count(r) as tagnum, m.title as title_now
WITH avg(tagnum) as avgtagnum
MATCH (m:MOVIE)-[r:HAS_TAG]->() 
WITH count(r) as tagnum, m.title as title_now, avgtagnum
WHERE tagnum > avgtagnum*3
RETURN title_now, avgtagnum, tagnum 
ORDER BY  tagnum DESC

And also how to get the ones with less note

MATCH ()-[r:RATED]->(m:MOVIE) 
WITH  avg(r.rating) as avgrating 
MATCH ()-[r:RATED]->(m:MOVIE) 
WITH m.title as titlemovie, avgrating, avg(r.rating) as  movierating 
WHERE movierating < avgrating/2
RETURN titlemovie, avgrating, movierating 
ORDER BY movierating ASC

BUT when I merge them together, the query never ends.

MATCH ()-[r:RATED]->(m:MOVIE) 
WITH  avg(r.rating) as avgrating 
MATCH ()-[r:RATED]->(m:MOVIE) 
WITH m.title as titlemovie, avgrating, avg(r.rating) as  movierating 
WHERE movierating < avgrating/2
MATCH (m:MOVIE)-[r:HAS_TAG]->() 
WITH count(r) as tagnum, m.title as title_now, titlemovie, avgrating, movierating 
WITH avg(tagnum) as avgtagnum, titlemovie, avgrating, movierating 
MATCH (m:MOVIE)-[r:HAS_TAG]->() 
WITH count(r) as tagnum, m.title as title_now, avgtagnum, titlemovie, avgrating, movierating 
WHERE tagnum > avgtagnum*3
RETURN titlemovie, avgrating, movierating, avgtagnum, tagnum 
ORDER BY tagnum DESC,  movierating ASC

The question is how do I found the movies that follow my 2 requirements; less note and more popular?

Upvotes: 1

Views: 323

Answers (1)

Qiaorui Xiang
Qiaorui Xiang

Reputation: 36

  1. First, using avg(rating) will give you average rating not average rating by movie.

2. MATCH (m:MOVIE)-[r:HAS_TAG]->()
WITH count(r) as tagnum, ... WITH avg(tagnum) as avgtagnum ...
This code will not give you exact average number of tag due to pattern matching only take into account those movies which have some tags (So you will get bigger average number than actually is).

  1. Why using WITH ... m.title as title_now, titlemovie two title? Best guess is you match wrong movie

If have to correct you query, I would write as this:


MATCH ()-[r:RATED]->(m:MOVIE) 
WITH  m, avg(r.rating) as avgra
with avg(avgra) as avgrating
MATCH (m:MOVIE) 
WITH avg(size((m)-[:HAS_TAG]-())) as avgtagnum, avgrating
MATCH (m:MOVIE)-[r:HAS_TAG]->() 
WITH count(r) as tagnum, m.title as titlemovie, avgtagnum, avgrating, avg(r.rating) as movierating
WHERE tagnum > avgtagnum*3 and movierating < avgrating/2
RETURN titlemovie, avgrating, movierating, avgtagnum, tagnum 
ORDER BY tagnum DESC,  movierating ASC

Upvotes: 2

Related Questions