Reputation: 1056
I thought I understood how left outer joins work, but I have a situation that is not working, and I'm not 100% sure if the way I have my query structured is incorrect, or if it's a data issue.
For background, I have the following MySQL table structures:
mysql> describe achievement;
+-------------+----------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+----------------------+------+-----+---------+-------+
| id | varchar(64) | NO | PRI | NULL | |
| game_id | varchar(10) | NO | PRI | NULL | |
| name | varchar(64) | NO | | NULL | |
| description | varchar(255) | NO | | NULL | |
| image_url | varchar(255) | NO | | NULL | |
| gamerscore | smallint(5) unsigned | NO | | 0 | |
| hidden | tinyint(1) | NO | | 0 | |
| base_hidden | tinyint(1) | NO | | 0 | |
+-------------+----------------------+------+-----+---------+-------+
8 rows in set (0.00 sec)
and
mysql> describe gamer_achievement;
+----------------+---------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+----------------+---------------------+------+-----+---------+-------+
| game_id | varchar(10) | NO | PRI | NULL | |
| achievement_id | varchar(64) | NO | PRI | NULL | |
| gamer_id | varchar(36) | NO | PRI | NULL | |
| earned_epoch | bigint(20) unsigned | NO | | 0 | |
| offline | tinyint(1) | NO | | 0 | |
+----------------+---------------------+------+-----+---------+-------+
5 rows in set (0.00 sec)
As for the data, this is what I have populated here (only pertinent columns included for brevity):
+----+------------+------------------------------+
| id | game_id | name |
+----+------------+------------------------------+
| 1 | 1480656849 | Cluster Buster |
| 2 | 1480656849 | Star Gazer |
| 3 | 1480656849 | Flower Child |
| 4 | 1480656849 | Oyster-meister |
| 5 | 1480656849 | Big Cheese of the South Seas |
| 6 | 1480656849 | Hexic Addict |
| 7 | 1480656849 | Collapse Master |
| 8 | 1480656849 | Survivalist |
| 9 | 1480656849 | Tick-Tock Doc |
| 10 | 1480656849 | Marathon Mogul |
| 11 | 1480656849 | Millionaire Extraordinaire |
| 12 | 1480656849 | Grand Pearl Pooh-Bah |
+----+------------+------------------------------+
12 rows in set (0.00 sec)
and
+----------------+------------+--------------+---------+
| achievement_id | game_id | earned_epoch | offline |
+----------------+------------+--------------+---------+
| 1 | 1480656849 | 0 | 1 |
| 2 | 1480656849 | 0 | 1 |
| 3 | 1480656849 | 0 | 1 |
| 4 | 1480656849 | 1149789371 | 0 |
| 7 | 1480656849 | 1149800406 | 0 |
| 8 | 1480656849 | 0 | 1 |
| 9 | 1480656849 | 1149794790 | 0 |
| 10 | 1480656849 | 1149792417 | 0 |
+----------------+------------+--------------+---------+
8 rows in set (0.02 sec)
In this particular case, the achievement
table is the "master" table and will contain the information that I always want to see. The gamer_achievement
table only contains information for achievements that are actually earned. For any particular game for any particular gamer, there can be any number of rows in the gamer_achievement
table - including none if no achievements have been earned for that game. For example, in the sample data above, achievements with ids 5, 6, 11, and 12 have not been earned.
What I currently have written is
select a.id,
a.name,
ga.earned_epoch,
ga.offline
from achievement a
LEFT OUTER JOIN gamer_achievement ga
ON (a.id = ga.achievement_id and a.game_id = ga.game_id)
where ga.gamer_id = 'fba8fcaa-f57b-44c6-9431-4ab78605b024'
and a.game_id = '1480656849'
order by convert (a.id, unsigned)
but this is only returning the full information for those achievements that have actually been earned - the unearned achievement information from the right side table (gamer_achievement
) is not being show with the NULL values as I would expect from this type of query. This is what I am expecting to see:
+----+-------------------------------+--------------+---------+
| id | name | earned_epoch | offline |
+----+-------------------------------+--------------+---------+
| 1 | Cluster Buster | 0 | 1 |
| 2 | Star Gazer | 0 | 1 |
| 3 | Flower Child | 0 | 1 |
| 4 | Oyster-meister | 1149789371 | 0 |
| 5 | Big Cheese of the South Seas | NULL | NULL |
| 6 | Hexic Addict | NULL | NULL |
| 7 | Collapse Master | 1149800406 | 0 |
| 8 | Survivalist | 0 | 1 |
| 9 | Tick-Tock Doc | 1149794790 | 0 |
| 10 | Marathon Mogul | 1149792417 | 0 |
| 11 | Millionaire Extraordinaire | NULL | NULL |
| 12 | Grand Pearl Pooh-Bah | NULL | NULL |
+----+-------------------------------+--------------+---------+
12 rows in set (0.00 sec)
What am I missing here? From what I understand, the basic query LOOKS right to me, but I'm obviously missing some piece of critical information.
Upvotes: 28
Views: 11746
Reputation: 48139
Many have answered, but I'll try too and hopefully lend in some more clarification. How I have always interpreted it (and you can check so many other posts I've responded to with LEFT joins), I try to list the table I want everything from first (left side... hence read from left to right). Then left join to the "Other" table (right side) on whatever the criteria is between them... Then, when doing a left join, and there are additional criteria against the right side table, those conditions would stay with that join condition. By bringing them into the "WHERE" clause would imply an INNER JOIN (must always match) which is not what you want... I also try to always show the left table alias.field = right table alias.field to keep the correlation clear... Then, apply the where clause to the basis criteria you want from the first table.. something like
select
a.id,
a.name,
ga.earned_epoch,
ga.offline
from
achievement a
LEFT OUTER JOIN gamer_achievement ga
ON a.id = ga.achievement_id
AND a.game_id = ga.game_id
AND ga.gamer_id = 'fba8fcaa-f57b-44c6-9431-4ab78605b024'
where
a.game_id = '1480656849'
order by
convert (a.id, unsigned)
Notice the direct relation between "a" and "ga" by the common ID and game ID values, but then tacked on the specific gamer. The where clause only cares at the outer level of achievement based on the specific game.
Upvotes: 17
Reputation: 53830
WHERE
clauses filter results from the entire result set. If you want to apply a filter only to the JOIN
, then you can add the expression to the ON
clause.
In the following query, I've moved the filter expression that applies to the joined table (ga.gamer_id =
) from the WHERE clause to the ON clause. This prevents the expression from filtering out rows where gamer_achievement values are NULL.
SELECT a.id,
a.name,
ga.earned_epoch,
ga.offline
FROM achievement a
LEFT OUTER JOIN gamer_achievement ga
ON ga.achievement_id = a.id
AND ga.game_id = a.game_id
AND ga.gamer_id = 'fba8fcaa-f57b-44c6-9431-4ab78605b024'
WHERE
a.game_id = '1480656849'
ORDER BY CONVERT(a.id, UNSIGNED)
Upvotes: 3
Reputation: 459
My guess is that the where clause is filtering out your desired results, moving it to the left join may work.
select a.id,
a.name,
ga.earned_epoch,
ga.offline
from achievement a
LEFT OUTER JOIN gamer_achievement ga
ON (a.id = ga.achievement_id and
a.game_id = ga.game_id and
ga.gamer_id = 'fba8fcaa-f57b-44c6-9431-4ab78605b024' and
a.game_id = '1480656849')
order by convert (a.id, unsigned)
Upvotes: 0
Reputation: 50835
It's because of this line:
where ga.gamer_id = 'fba8fcaa-f57b-44c6-9431-4ab78605b024'
If the gamer
hasn't earned the achievement
, the ga.gamer_id
value will be NULL
and not qualify for the WHERE
condition.
Upvotes: 2
Reputation: 79185
In the WHERE clause you discard some rows that the LEFT JOIN would have filled with NULL values. You want to put the condition ga.gamer_id = 'fba8fcaa-f57b-44c6-9431-4ab78605b024'
inside the JOIN clause.
Another option is:
LEFT OUTER JOIN (SELECT * FROM gamer_achievement
WHERE ga.gamer_id = 'fba8fcaa-f57b-44c6-9431-4ab78605b024'
) ga
Remember that the join is performed, and at this time, NULL values come if the condition cannot be met; then the where
filter applies.
Upvotes: 9