Reputation: 127
I am using MySQL 8.0, its "Sakila" database to solve a problem which says--
Find the total number of films for each film rating (G, PG, ...) for each actor.
To give more background, here are the two tables joined and their description:
Table #1: film
+----------------------+---------------------------------------------------------------------+------+-----+-------------------+-----------------------------------------------+
| Field | Type | Null | Key | Default | Extra |
+----------------------+---------------------------------------------------------------------+------+-----+-------------------+-----------------------------------------------+
| film_id | smallint(5) unsigned | NO | PRI | NULL | auto_increment |
| title | varchar(255) | NO | MUL | NULL | |
| description | text | YES | | NULL | |
| release_year | year(4) | YES | | NULL | |
| language_id | tinyint(3) unsigned | NO | MUL | NULL | |
| original_language_id | tinyint(3) unsigned | YES | MUL | NULL | |
| rental_duration | tinyint(3) unsigned | NO | | 3 | |
| rental_rate | decimal(4,2) | NO | | 4.99 | |
| length | smallint(5) unsigned | YES | | NULL | |
| replacement_cost | decimal(5,2) | NO | | 19.99 | |
| rating | enum('G','PG','PG-13','R','NC-17') | YES | | G | |
| special_features | set('Trailers','Commentaries','Deleted Scenes','Behind the Scenes') | YES | | NULL | |
| last_update | timestamp | NO | | CURRENT_TIMESTAMP | DEFAULT_GENERATED on update CURRENT_TIMESTAMP |
+----------------------+---------------------------------------------------------------------+------+-----+-------------------+-----------------------------------------------+
Table #2: film_actor
+-------------+----------------------+------+-----+-------------------+-----------------------------------------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+----------------------+------+-----+-------------------+-----------------------------------------------+
| actor_id | smallint(5) unsigned | NO | PRI | NULL | |
| film_id | smallint(5) unsigned | NO | PRI | NULL | |
| last_update | timestamp | NO | | CURRENT_TIMESTAMP | DEFAULT_GENERATED on update CURRENT_TIMESTAMP |
+-------------+----------------------+------+-----+-------------------+-----------------------------------------------+
From the question asked, what I understood is that the query should return count
of all the films from the film_actor
table, as it mentioned the rating
column which is only present in the same table. On that basis, I used the left join
but the author solved it with inner join
. Could someone please clarify why inner join
is correct and not left join
.
Here is my solution:
select count(f.film_id), f.rating, fa.actor_id
from film f
left join film_actor fa
on f.film_id = fa.film_id
group by f.rating, fa.actor_id;
Solution give by the author:
SELECT fa.actor_id, f.rating, count(*)
FROM film_actor fa
INNER JOIN film f
ON fa.film_id = f.film_id
GROUP BY fa.actor_id, f.rating
Request to please point out the gap in my understanding and also, please guide me to come up with the correct join to use in such word/scenario based problems.
Upvotes: 0
Views: 54
Reputation: 1269873
This is more a question of English than of databases. But the question is implicitly saying: "Find the total number of films for each rating for each actor in those films". That is why an inner join is appropriate.
In practice, it is reasonable to assume that all actors are in a film. This may not be true, but it is a reasonable assumption and if true LEFT JOIN
and INNER JOIN
would be the same.
I suspect you are thinking of something a little difference, which would be all ratings for all actors, with the count -- even if there are no films. That would be phrased slightly differently. More importantly, a LEFT JOIN
would not answer this question correctly.
Upvotes: 2