Reputation: 19
I have these 4 tables in MySql:
User:
Movie:
Collection (many-to-many relation between user and movie):
Review:
So basically what I want is to pull data from user, review and collection and display some sort of user statistics, like the number of reviews each user left, and the number of movies each user saved to their collection. I thought it was just going to be a simple join between 3 tables but apparently I was wrong.
I thought using the COUNT() aggregate function would be the best idea and so I started by making separate queries for user-review and user-collection, and that works just fine, the result for both queries is correct.
SELECT u.userID, username, COUNT(movieID) AS collection_size
FROM collection c INNER JOIN USER u ON u.userID=c.userID
GROUP BY c.userID
userID username collection_size
7 user 2
8 user03 6
And
SELECT u.userID, username, COUNT(movieID) AS review_count
FROM review r INNER JOIN USER u ON u.userID=r.userID
GROUP BY r.userID
userID username review_count
7 user 1
8 user03 4
10 user05 1
The problem arises when trying to join all three tables. The closest I've come to solving the issues is this:
SELECT u.userID, username, COUNT(DISTINCT c.movieID) AS collection_size, COUNT(DISTINCT r.movieID) AS review_count
FROM collection c INNER JOIN USER u ON u.userID=c.userID
INNER JOIN review r ON r.userID=u.userID
GROUP BY u.userID
userID username collection_size review_count
7 user 2 1
8 user03 6 4
The result is almost correct, however as you can see the user with the userID of 10 is missing, even though he has left one review. This query seems to pick up only users which have left at least one review AND have at least one movie in their collection. I've verified this by adding a movie to user 10's collection. He then properly shows up in the result. How do I change the query to show users who have left at least one review OR have at least one movie in their collection?
Basically I want the result to be this:
userID username collection_size review_count
7 user 2 1
8 user03 6 4
10 user05 0 1
Upvotes: 0
Views: 54
Reputation: 162
SELECT
a.userid
,b.qty_movies
,c.qty_reviews
FROM
user a
LEFT OUTER JOIN (
SELECT
userid
,COUNT(*) qty_movies
FROM
collection
GROUP BY 1
) b ON (a.userid = b.userid)
LEFT OUTER JOIN (
SELECT
userid
,COUNT(*) qty_reviews
FROM
review
GROUP BY 1
) c ON (a.userid = c.userid)
;
https://www.db-fiddle.com/f/fiiwTyoFEuspGyscAh23bG/1
Upvotes: 0
Reputation: 222422
It looks like this user does not have a collection (it is not returned by your first query). You can use LEFT JOIN
s instead, starting from the user table:
SELECT u.userID, u.username,
COUNT(DISTINCT c.movieID) AS collection_size,
COUNT(DISTINCT r.movieID) AS review_count
FROM usr u
LEFT JOIN collection c ON u.userID=c.userID
LEFT JOIN review r ON r.userID=u.userID
GROUP BY u.userID
Although this works, it is not efficient. The joins multiply the rows on both sides, only to then count distinct primary key values. I think that the query would be more efficiently phrased with two subqueries for the counts:
select u.userid, u.username,
(select count(*) from collection c where c.userid = u.userid) as collection_size,
(select count(*) from review r where r.userid = u.userid) as review_count
from usr u
Upvotes: 2