Reputation: 131
I am learning about SQL joins from this tutorial: http://www.programmerinterview.com/index.php/database-sql/what-is-a-self-join/
The table they are using is:
+---------------+-------------------+
| employee_name | employee_location |
+---------------+-------------------+
| Joe | New York |
| Sunil | India |
| Alex | Russia |
| Albert | Canada |
| Jack | New York |
+---------------+-------------------+
I decided to experiment with the queries by leaving out certain parts like this:
select * from
employee t1, employee t2
where t1.employee_location = t2.employee_location
This query seemed pretty straightforward. "Match up people with the same location. I thought ok, the result will be this:
+---------------+-------------------++---------------+-------------------+
| employee_name | employee_location || employee_name | employee_location |
+---------------+-------------------++---------------+-------------------+
| Joe | New York || Joe | New York |
| Sunil | India || Sunil | India |
| Alex | Russia || Alex | Russia |
| Albert | Canada || Albert | Canada |
| Jack | New York || Jack | New York |
+---------------+-------------------++---------------+-------------------+
But instead of the above I got this:
+---------------+-------------------+---------------+-------------------+
| employee_name | employee_location | employee_name | employee_location |
+---------------+-------------------+---------------+-------------------+
| Joe | New York | Joe | New York |
| Jack | New York | Joe | New York |
| Sunil | India | Sunil | India |
| Alex | Russia | Alex | Russia |
| Albert | Canada | Albert | Canada |
| Joe | New York | Jack | New York |
| Jack | New York | Jack | New York |
+---------------+-------------------+---------------+-------------------+
It works fine for Sunil, Alex and Albert. The combinations of Joe and Jack I do not understand. Can someone explain in detail how this works aside from the solution? I would like to understand the logic behind this result.
Upvotes: 0
Views: 102
Reputation: 17915
Each of the New York rows from t1
matches against two rows in t2
since all you're requiring is that the cities be the same. So you get 2 x 2 total matches (pairs) for New York. If you added a third New York then you'd get 9 rows.
To see the behavior you're expecting try adding and t1.employee_name = t2.employee.name
. You can't uniquely identify all people by location alone even though with most of them you can and so your result appears to almost work.
Upvotes: 1
Reputation: 76
The answer lies in within the "where" statement: where t1.employee_location = t2.employee_location Joe and Jack both match New York so are multiplied. You can see this better by adding another New York entry and re-running. You will get 9 New York results. To get what you expected you would need to add "and where t1.employee_name = t2.employee_name.
Upvotes: 0