Callombert
Callombert

Reputation: 1099

Joining two similar queries on different tables- MySQL OR

I'm having trouble with a MySQL query (I'm a newbie so go easy on me!). I have a website with TV shows and want to select episodes aired between two dates (no problem) based on two different actions (for a custom schedule). Basically, I want to show the episode if the user is either following (uses_follow_shows) or watching the show (user_watched) so I thought I would have (query_on_follow OR query_on_watch) like below but that doesn't work (I'm getting 1200 rows instead of 3, it's fetching every existing row from user_watched).

I thought this statement would do it but it doesn't and I don't understand why

SELECT * FROM show_episode_airdate, show_episode, show_network, network, shows, users_follow_shows, user_watched
                                              WHERE show_episode_airdate.airdate BETWEEN '2013-07-20' AND '2013-07-27' 
                                              AND ( show_episode.episode_id = show_episode_airdate.episode_id 
                                              AND shows.id = show_network.show_id
                                              AND show_network.network_id = network.network_id 
                                              AND show_episode.imdb_id_show = shows.imdb_id 
                                              AND users_follow_shows.user_id = 2 AND shows.id = users_follow_shows.show_id  

                                              OR user_watched.user_id = 2 
                                              AND shows.id = user_watched.show_id 
                                              AND show_episode.episode_id = show_episode_airdate.episode_id 
                                              AND shows.id = show_network.show_id
                                              AND show_network.network_id = network.network_id 
                                              AND show_episode.imdb_id_show = shows.imdb_id )
                                              ORDER by network.network_id ASC

Can anybody tell me what's wrong ?

Thanks!

Edit:

SELECT *
 FROM show_episode_airdate join
 show_episode 
 on show_episode.episode_id = show_episode_airdate.episode_id join
 shows
 on shows.imdb_id = show_episode.imdb_id_show join
 show_network
 on show_network.show_id = shows.id join
 network
 on show_network.network_id = network.network_id join
 users_follow_shows
 on shows.id = users_follow_shows.show_id join
 user_watched
 on shows.id = user_watched.show_id 
WHERE show_episode_airdate.airdate BETWEEN '2013-07-20' AND '2013-07-27' and
  (users_follow_shows.user_id = 2  or
   user_watched.user_id = 2
  )
 ORDER by network.network_id ASC;

Upvotes: 3

Views: 113

Answers (3)

Gordon Linoff
Gordon Linoff

Reputation: 1269623

If you write this with proper join syntax, then the logic problems go away. The result looks something like this:

SELECT *
FROM show_episode_airdate join
     show_episode 
     on show_episode.episode_id = show_episode_airdate.episode_id join
     show_network 
     on show_network.show_id = network.show_id join     <------- THIS IS NOT IN YOUR ORIGINAL LIST
     network
     on show_network.network_id = network.network_id join
     user_follows_shows
     on shows.id = users_follow_shows.show_id join
     user_watched
     on shows.id = user_watched.show_id 
WHERE show_episode_airdate.airdate BETWEEN '2013-07-20' AND '2013-07-27' and
      (users_follow_shows.user_id = 2  or
       user_watched.user_id = 2
      )
ORDER by network.network_id ASC;

What are the benefits? The query is more understandable, easier to write, easier to understand, and less prone to making horrible mistakes like leaving out a join condition. The SQL engine has an easier time optimizing it, and it should run faster. Use proper join syntax.

As if to emphasize my point, the original query has 7 tables, but only 5 join conditions. This usually means you are doing a Cartesian product between tables. In this case, I believe it is missing the relationship between show_network and shows.

EDIT:

Now at least, the query is rather understandable. The problem is that users may follow some shows and may watch some shows. Your query is producing a cartesian product between these two sets. Worse, if a show has no followers or no watchers, then it will simply drop out.

One way to fix the problem is to split the query in two. Instead, I'll use left outer joins and distinct in the select clause:

SELECT distinct *
 FROM show_episode_airdate join
 show_episode 
 on show_episode.episode_id = show_episode_airdate.episode_id join
 shows
 on shows.imdb_id = show_episode.imdb_id_show join
 show_network
 on show_network.show_id = shows.id join
 network
 on show_network.network_id = network.network_id left outer join
 users_follow_shows
 on shows.id = users_follow_shows.show_id left outer join
 user_watched
 on shows.id = user_watched.show_id 
WHERE show_episode_airdate.airdate BETWEEN '2013-07-20' AND '2013-07-27' and
  (users_follow_shows.user_id = 2  or
   user_watched.user_id = 2
  )
 ORDER by network.network_id ASC;

The * should be replaced by the columns you actually want -- and the distinct will only work if you avoid columns from users_follow_shows and user_watched. The left outer join means that all shows will be considered.

An alternative way of writing this -- that gets rid of the need for distinct in the select clause -- is to write a subquery to combine the watched and followed shows. This removes the duplicates, and looks like:

SELECT *
FROM show_episode_airdate join
     show_episode 
     on show_episode.episode_id = show_episode_airdate.episode_id join
     shows
     on shows.imdb_id = show_episode.imdb_id_show join
     show_network
     on show_network.show_id = shows.id join
     network
     on show_network.network_id = network.network_id join
     (select show_id, user_id
      from users_follow_shows
      union
      select show_id, user_id
      from user_watched
     ) watch_or_follow
     on shows.id = watch_or_follow.show_id
WHERE show_episode_airdate.airdate BETWEEN '2013-07-20' AND '2013-07-27' and
      watch_or_follow.user_id = 2
ORDER by network.network_id ASC;

Upvotes: 1

Shaikh Farooque
Shaikh Farooque

Reputation: 2630

It is due to the OR condition you used in the where clause. Give proper paranthesis for combine conditions.

Just try the following query.

SELECT * FROM show_episode_airdate, show_episode, show_network, network, shows, users_follow_shows, user_watched
WHERE show_episode_airdate.airdate BETWEEN '2013-07-20' AND '2013-07-27' 
 AND (( show_episode.episode_id = show_episode_airdate.episode_id 
 AND shows.id = show_network.show_id
 AND show_network.network_id = network.network_id 
 AND show_episode.imdb_id_show = shows.imdb_id 
 AND users_follow_shows.user_id = 2 AND shows.id = users_follow_shows.show_id  )
OR (user_watched.user_id = 2 
AND shows.id = user_watched.show_id 
AND show_episode.episode_id = show_episode_airdate.episode_id 
AND shows.id = show_network.show_id
AND show_network.network_id = network.network_id 
AND show_episode.imdb_id_show = shows.imdb_id ))

ORDER by network.network_id ASC

Upvotes: 0

mirkobrankovic
mirkobrankovic

Reputation: 2347

you need to put () for OR conditions, for example if that is your goal:

SELECT * FROM show_episode_airdate, show_episode, show_network, network, shows, users_follow_shows, user_watched
WHERE show_episode_airdate.airdate BETWEEN '2013-07-20' AND '2013-07-27' 
AND ( (show_episode.episode_id = show_episode_airdate.episode_id 
AND shows.id = show_network.show_id
AND show_network.network_id = network.network_id 
AND show_episode.imdb_id_show = shows.imdb_id 
AND users_follow_shows.user_id = 2 
AND shows.id = users_follow_shows.show_id)
OR (user_watched.user_id = 2 
AND shows.id = user_watched.show_id 
AND show_episode.episode_id = show_episode_airdate.episode_id 
AND shows.id = show_network.show_id
AND show_network.network_id = network.network_id 
AND show_episode.imdb_id_show = shows.imdb_id ) )
ORDER by network.network_id ASC

so you will get: IF BETWEEN dates AND first (conditions) OR second (conditions)

I suggest you use JOIN/INNER JOIN, I got lost in all of those ANDs :D

Upvotes: 0

Related Questions