Amit

Reputation: 3990

Getting all movies belonging to a set of tags

So I have a Neo4j database of movies along with their genres. I'd like to find all movies that belong to every genre in a given set.

For example:

Matrix - Sci-fi, Thriller, Action
Harry Potter - Drama, Fiction, Thriller
Pulp Fiction - Drama, Thriller

What I want is the movies that belong to both Drama and Thriller. That means Harry Potter and Pulp Fiction, but not Matrix, even though it also belongs to Thriller.

Any ideas on the query?

Upvotes: 0

Views: 270

Answers (3)

Dave Bennett

Reputation: 11216

Consider the following data set:

Create (g1:Genre {name: 'Sci-fi' } )
Create (g2:Genre {name: 'Thriller' } )
Create (g3:Genre {name: 'Action' } )
Create (g4:Genre {name: 'Drama' } )
Create (g5:Genre {name: 'Fiction' } )
Create (m1:Movie {name: 'Matrix' } )
Create (m1)-[:IS_GENRE]->(g1)
Create (m1)-[:IS_GENRE]->(g2)
Create (m1)-[:IS_GENRE]->(g3)
Create (m2:Movie {name: 'Harry Potter' } )
Create (m2)-[:IS_GENRE]->(g4)
Create (m2)-[:IS_GENRE]->(g5)
Create (m2)-[:IS_GENRE]->(g2)
Create (m3:Movie {name: 'Pulp Fiction' } )
Create (m3)-[:IS_GENRE]->(g4)
Create (m3)-[:IS_GENRE]->(g2)

The following query first matches only the genres in a given list, then keeps only the movies whose number of matched genres equals the size of that list.

with ['Thriller','Drama'] as genre_list
match (m:Movie)-[r:IS_GENRE]->(g:Genre)
using index g:Genre(name)
where g.name in genre_list
with genre_list,m.name as movie, collect(g.name) as genres
where size(genres) = size(genre_list)
return movie, genres
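A minimal variant of the query above, sketched under two assumptions: the genre list is passed as a query parameter (the name `$genre_list` is hypothetical) and contains no duplicate names. Collecting with DISTINCT keeps the size comparison correct even if a movie happens to be linked to the same genre more than once.

```cypher
// Assumes the Movie / Genre / IS_GENRE data set created above.
// $genre_list is a hypothetical parameter, e.g. ['Thriller', 'Drama'].
MATCH (m:Movie)-[:IS_GENRE]->(g:Genre)
WHERE g.name IN $genre_list
// DISTINCT guards against duplicate relationships to the same genre.
WITH m, collect(DISTINCT g.name) AS genres
// Keep only movies that matched every genre in the list.
WHERE size(genres) = size($genre_list)
RETURN m.name AS movie, genres
```

Against the sample data this should return Harry Potter and Pulp Fiction, since Matrix matches only Thriller.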

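If you are only matching on two genres, I think this alternative is pretty interesting. It finds the shortest paths between the two Genre nodes, and each two-hop path passes through a movie that has both genres: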

match p=allShortestPaths((g1:Genre {name: 'Thriller'} )-[:IS_GENRE*..2]-(g2:Genre {name: 'Drama'} ))
using index g1:Genre(name)
using index g2:Genre(name)
return (nodes(p)[1]).name as movie, [(nodes(p)[0]).name, (nodes(p)[2]).name] as genres

Upvotes: 0

cybersam

Reputation: 66999

  1. You don't need the relationship in each direction (contains and genres) that you mentioned in a comment. One direction is enough, since a relationship can be traversed in either direction. In this answer, I will use only the genre relationship.

  2. I assume that you first create an index on the name property of the Genre node.

    CREATE INDEX ON :Genre(name);
    

    This index will allow the actual query, below, to quickly get the desired Genre nodes without having to iterate through every such node.

    MATCH (g1:Genre { name: 'Drama' })<-[:genre]-(m:Movie)-[:genre]->(g2:Genre { name: 'Thriller' })
    USING INDEX g1:Genre(name)
    USING INDEX g2:Genre(name)
    RETURN m;
    

    This simple and efficient query forces (via USING INDEX) the Cypher planner to use the above index for both Genre nodes (since the planner currently only does that automatically for one of them).
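A note on versions, offered as a sketch rather than a definitive recipe: the `CREATE INDEX ON :Genre(name)` form shown above is the older syntax. On Neo4j 4.x and later, the equivalent index would typically be created as below. The index name `genre_name_index` is hypothetical; pick whatever suits your schema.

```cypher
// Newer index syntax (Neo4j 4.x and later).
// genre_name_index is a hypothetical name chosen for illustration.
CREATE INDEX genre_name_index
FOR (g:Genre) ON (g.name);
```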

Upvotes: 1

Martin Preusse

Reputation: 9369

You can MATCH your Movie nodes and filter on their relationships to the two Genre nodes:

MATCH (m:Movie)
WHERE (m)-[:GENRE]->(:Genre {genre_name: 'Drama'})
AND (m)-[:GENRE]->(:Genre {genre_name: 'Thriller'})
RETURN m
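The same idea can be extended to any number of genres. The following is a rough sketch, assuming the genre names arrive as a query parameter (the name `$genres` is hypothetical) and that pattern predicates are accepted inside `all()` on your Neo4j version; on newer versions an `EXISTS { ... }` subquery may be the preferred form.

```cypher
// $genres is a hypothetical parameter, e.g. ['Drama', 'Thriller'].
MATCH (m:Movie)
// Keep a movie only if it has a GENRE relationship to every listed genre.
WHERE all(name IN $genres
          WHERE (m)-[:GENRE]->(:Genre {genre_name: name}))
RETURN m
```

This avoids writing one AND clause per genre, at the cost of checking one pattern per list entry for each movie.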

Upvotes: 2
