user3809875
user3809875

Reputation: 131

MySql self join unexpected result inner join

I am learning about SQL joins from this tutorial: http://www.programmerinterview.com/index.php/database-sql/what-is-a-self-join/

The table they are using is:

+---------------+-------------------+
| employee_name | employee_location |
+---------------+-------------------+
| Joe           | New York          |
| Sunil         | India             |
| Alex          | Russia            |
| Albert        | Canada            |
| Jack          | New York          |
+---------------+-------------------+

I decided to experiment with the queries by leaving out certain parts like this:

select * from
employee t1, employee t2
where t1.employee_location = t2.employee_location

This query seemed pretty straightforward. "Match up people with the same location. I thought ok, the result will be this:

+---------------+-------------------++---------------+-------------------+
| employee_name | employee_location || employee_name | employee_location |
+---------------+-------------------++---------------+-------------------+
| Joe           | New York          || Joe           | New York          |
| Sunil         | India             || Sunil         | India             |
| Alex          | Russia            || Alex          | Russia            |
| Albert        | Canada            || Albert        | Canada            |
| Jack          | New York          || Jack          | New York          |
+---------------+-------------------++---------------+-------------------+

But instead of the above I got this:

+---------------+-------------------+---------------+-------------------+
| employee_name | employee_location | employee_name | employee_location |
+---------------+-------------------+---------------+-------------------+
| Joe           | New York          | Joe           | New York          |
| Jack          | New York          | Joe           | New York          |
| Sunil         | India             | Sunil         | India             |
| Alex          | Russia            | Alex          | Russia            |
| Albert        | Canada            | Albert        | Canada            |
| Joe           | New York          | Jack          | New York          |
| Jack          | New York          | Jack          | New York          |
+---------------+-------------------+---------------+-------------------+

It works fine for Sunil, Alex and Albert. The combinations of Joe and Jack I do not understand. Can someone explain in detail how this works aside from the solution? I would like to understand the logic behind this result.

Upvotes: 0

Views: 102

Answers (2)

shawnt00
shawnt00

Reputation: 17915

Each of the New York rows from t1 matches against two rows in t2 since all you're requiring is that the cities be the same. So you get 2 x 2 total matches (pairs) for New York. If you added a third New York then you'd get 9 rows.

To see the behavior you're expecting try adding and t1.employee_name = t2.employee.name. You can't uniquely identify all people by location alone even though with most of them you can and so your result appears to almost work.

Upvotes: 1

DTecMeister
DTecMeister

Reputation: 76

The answer lies in within the "where" statement: where t1.employee_location = t2.employee_location Joe and Jack both match New York so are multiplied. You can see this better by adding another New York entry and re-running. You will get 9 New York results. To get what you expected you would need to add "and where t1.employee_name = t2.employee_name.

Upvotes: 0

Related Questions