Reputation: 1099
I'm having trouble with a MySQL query (I'm a newbie so go easy on me!). I have a website with TV shows and want to select episodes aired between two dates (no problem) based on two different actions (for a custom schedule). Basically, I want to show the episode if the user is either following (uses_follow_shows) or watching the show (user_watched) so I thought I would have (query_on_follow OR query_on_watch) like below but that doesn't work (I'm getting 1200 rows instead of 3, it's fetching every existing row from user_watched).
I thought this statement would do it but it doesn't and I don't understand why
SELECT * FROM show_episode_airdate, show_episode, show_network, network, shows, users_follow_shows, user_watched
WHERE show_episode_airdate.airdate BETWEEN '2013-07-20' AND '2013-07-27'
AND ( show_episode.episode_id = show_episode_airdate.episode_id
AND shows.id = show_network.show_id
AND show_network.network_id = network.network_id
AND show_episode.imdb_id_show = shows.imdb_id
AND users_follow_shows.user_id = 2 AND shows.id = users_follow_shows.show_id
OR user_watched.user_id = 2
AND shows.id = user_watched.show_id
AND show_episode.episode_id = show_episode_airdate.episode_id
AND shows.id = show_network.show_id
AND show_network.network_id = network.network_id
AND show_episode.imdb_id_show = shows.imdb_id )
ORDER by network.network_id ASC
Can anybody tell me what's wrong ?
Thanks!
Edit:
SELECT *
FROM show_episode_airdate join
show_episode
on show_episode.episode_id = show_episode_airdate.episode_id join
shows
on shows.imdb_id = show_episode.imdb_id_show join
show_network
on show_network.show_id = shows.id join
network
on show_network.network_id = network.network_id join
users_follow_shows
on shows.id = users_follow_shows.show_id join
user_watched
on shows.id = user_watched.show_id
WHERE show_episode_airdate.airdate BETWEEN '2013-07-20' AND '2013-07-27' and
(users_follow_shows.user_id = 2 or
user_watched.user_id = 2
)
ORDER by network.network_id ASC;
Upvotes: 3
Views: 113
Reputation: 1269623
If you write this with proper join syntax, then the logic problems go away. The result looks something like this:
SELECT *
FROM show_episode_airdate join
show_episode
on show_episode.episode_id = show_episode_airdate.episode_id join
show_network
on show_network.show_id = network.show_id join <------- THIS IS NOT IN YOUR ORIGINAL LIST
network
on show_network.network_id = network.network_id join
user_follows_shows
on shows.id = users_follow_shows.show_id join
user_watched
on shows.id = user_watched.show_id
WHERE show_episode_airdate.airdate BETWEEN '2013-07-20' AND '2013-07-27' and
(users_follow_shows.user_id = 2 or
user_watched.user_id = 2
)
ORDER by network.network_id ASC;
What are the benefits? The query is more understandable, easier to write, easier to understand, and less prone to making horrible mistakes like leaving out a join condition. The SQL engine has an easier time optimizing it, and it should run faster. Use proper join syntax.
As if to emphasize my point, the original query has 7 tables, but only 5 join conditions. This usually means you are doing a Cartesian product between tables. In this case, I believe it is missing the relationship between show_network
and shows
.
EDIT:
Now at least, the query is rather understandable. The problem is that users may follow some shows and may watch some shows. Your query is producing a cartesian product between these two sets. Worse, if a show has no followers or no watchers, then it will simply drop out.
One way to fix the problem is to split the query in two. Instead, I'll use left outer join
s and distinct
in the select
clause:
SELECT distinct *
FROM show_episode_airdate join
show_episode
on show_episode.episode_id = show_episode_airdate.episode_id join
shows
on shows.imdb_id = show_episode.imdb_id_show join
show_network
on show_network.show_id = shows.id join
network
on show_network.network_id = network.network_id left outer join
users_follow_shows
on shows.id = users_follow_shows.show_id left outer join
user_watched
on shows.id = user_watched.show_id
WHERE show_episode_airdate.airdate BETWEEN '2013-07-20' AND '2013-07-27' and
(users_follow_shows.user_id = 2 or
user_watched.user_id = 2
)
ORDER by network.network_id ASC;
The *
should be replaced by the columns you actually want -- and the distinct
will only work if you avoid columns from users_follow_shows
and user_watched
. The left outer join
means that all shows will be considered.
An alternative way of writing this -- that gets rid of the need for distinct
in the select
clause -- is to write a subquery to combine the watched and followed shows. This removes the duplicates, and looks like:
SELECT *
FROM show_episode_airdate join
show_episode
on show_episode.episode_id = show_episode_airdate.episode_id join
shows
on shows.imdb_id = show_episode.imdb_id_show join
show_network
on show_network.show_id = shows.id join
network
on show_network.network_id = network.network_id join
(select show_id, user_id
from users_follow_shows
union
select show_id, user_id
from user_watched
) watch_or_follow
on shows.id = watch_or_follow.show_id
WHERE show_episode_airdate.airdate BETWEEN '2013-07-20' AND '2013-07-27' and
watch_or_follow.user_id = 2
ORDER by network.network_id ASC;
Upvotes: 1
Reputation: 2630
It is due to the OR condition you used in the where clause. Give proper paranthesis for combine conditions.
Just try the following query.
SELECT * FROM show_episode_airdate, show_episode, show_network, network, shows, users_follow_shows, user_watched
WHERE show_episode_airdate.airdate BETWEEN '2013-07-20' AND '2013-07-27'
AND (( show_episode.episode_id = show_episode_airdate.episode_id
AND shows.id = show_network.show_id
AND show_network.network_id = network.network_id
AND show_episode.imdb_id_show = shows.imdb_id
AND users_follow_shows.user_id = 2 AND shows.id = users_follow_shows.show_id )
OR (user_watched.user_id = 2
AND shows.id = user_watched.show_id
AND show_episode.episode_id = show_episode_airdate.episode_id
AND shows.id = show_network.show_id
AND show_network.network_id = network.network_id
AND show_episode.imdb_id_show = shows.imdb_id ))
ORDER by network.network_id ASC
Upvotes: 0
Reputation: 2347
you need to put ()
for OR conditions, for example if that is your goal:
SELECT * FROM show_episode_airdate, show_episode, show_network, network, shows, users_follow_shows, user_watched
WHERE show_episode_airdate.airdate BETWEEN '2013-07-20' AND '2013-07-27'
AND ( (show_episode.episode_id = show_episode_airdate.episode_id
AND shows.id = show_network.show_id
AND show_network.network_id = network.network_id
AND show_episode.imdb_id_show = shows.imdb_id
AND users_follow_shows.user_id = 2
AND shows.id = users_follow_shows.show_id)
OR (user_watched.user_id = 2
AND shows.id = user_watched.show_id
AND show_episode.episode_id = show_episode_airdate.episode_id
AND shows.id = show_network.show_id
AND show_network.network_id = network.network_id
AND show_episode.imdb_id_show = shows.imdb_id ) )
ORDER by network.network_id ASC
so you will get: IF BETWEEN
dates AND
first (conditions) OR
second (conditions)
I suggest you use JOIN
/INNER JOIN
, I got lost in all of those ANDs :D
Upvotes: 0