tawfiq
tawfiq

Reputation: 43

sql select similar movies based on categories match frequency

hi there i hope you are good guys
i want to select similar movies based on movie categories
so i have this tables in my database

table movies :id,title,......,date
table movie_categories : id,id_movie,id_category
table categories : id,category

where category is simply : action,drama,.....
i'm little noob noob in sql queries so i hope someone can help me
and i've tried this so far

SELECT m.*,COUNT(mc.*) AS cat_frequency 
FROM movies m,movie_categories mc 
WHERE mc.id_category IN (SELECT c.id_category FROM movie_categories c WHERE id_movie = 1)
ORDER BY cat_frequency 

where 1 is the id of the movie that user currently viewing

the expected result is all movie data and with categories match frequency
for exemple let's say current watched movie have categories : action,fantastic,drama
so the movies that have the same categories (100% : have same categories) will be first in order , and then the movies that (66.66% : only 2 categories are matched) and so on ....

note : i can do it in php but i want to do it with sql

Upvotes: 1

Views: 826

Answers (2)

Juan Carlos Oropeza
Juan Carlos Oropeza

Reputation: 48177

First you need to know the current movie categories

 SELECT c.id_category 
 FROM movie_categories c 
 WHERE id_movie = 1

Then you need check how many categories match each movies

SELECT m.id, COUNT(*) AS cat_frequency 
FROM movies m
JOIN movie_categories mc ON m.id = mc.id_movie   
WHERE mc.id_category IN ( SELECT c.id_category 
                          FROM movie_categories c 
                          WHERE c.id_movie = 1 )
GROUP BY m.id

Now you have the matching number of cat, you join both to sort by that atribute

SELECT m.*
FROM movies m
JOIN (
        SELECT m.id, COUNT(*) AS cat_frequency 
        FROM movies m
        JOIN movie_categories mc ON m.id = mc.id_movie    
        WHERE mc.id_category IN ( SELECT c.id_category 
                                  FROM movie_categories c 
                                  WHERE c.id_movie = 1 )
        GROUP BY m.id
     ) f
  ON m.id = f.movie_id
WHERE m.id <> 1 -- you probably dont want show the current movie.
ORDER BY f.cat_frequency DESC

Upvotes: 2

Barbaros &#214;zhan
Barbaros &#214;zhan

Reputation: 65218

I suppose you need such a query :

SELECT m.title, c.id_category,COUNT(*) AS cat_frequency 
  FROM movies m
  JOIN movie_categories mc ON m.id = mc.id_movie
  JOIN categories c ON c.id_category = mc.id_category
 GROUP BY c.id_category
 ORDER BY cat_frequency DESC;

Upvotes: 0

Related Questions