Reputation: 107
I have two tables, one is a table of employee names, 176 records. The other is a table (with duplicates) of employee names (same format) and their locations (7943 rows).
From this answer i deduced i needed a left join to give me the rows from Table A only.
I was hoping the below would give me the original 176 rows back from Table A, each column with a value for location from Table B, else blank if not available, however it gives me 7601 rows which i cannot for the life of me understand:
SELECT e.[UniqueName], l.[location]
FROM [Employees] as e
left join Locations as l
on e.[UniqueName] = l.[UniqueName]
Even using a group by (which I'm not sure why this would be necessary given that I am asking only for whats in Table A) gives 172 rows even though each name in the Employees table is unique!
Upvotes: 0
Views: 667
Reputation: 16025
I was hoping the below would give me the original 176 rows back from Table A, each column with a value for location from Table B, else blank if not available, however it gives me 7601 rows which I cannot for the life of me understand...
Whilst a left join
will always return all records from the dataset on the lefthand side of the join, the number of records returned by the query will depend upon the number of possible pairings between the two datasets, which (for a left join
) will always be greater than or equal to the number of records in the dataset to the left of the join.
For your example, consider the following two datasets:
Employees
+------------+
| UniqueName |
+------------+
| Alice |
| Bob |
| Charlie |
+------------+
Locations
+------------+----------+
| UniqueName | Location |
+------------+----------+
| Alice | London |
| Bob | Berlin |
| Bob | New York |
| Bob | Paris |
+------------+----------+
Evaluating the query:
select
e.[uniquename], l.[location]
from
[employees] as e left join locations as l
on e.[uniquename] = l.[uniquename]
Will cause the records to be paired up in the following manner:
And will therefore return the result:
+------------+----------+
| uniquename | location |
+------------+----------+
| Alice | London |
| Bob | Berlin |
| Bob | New York |
| Bob | Paris |
| Charlie | |
+------------+----------+
Upvotes: 1
Reputation: 164184
The table Locations
contains more than 1 locations for each employee and this is why you get so many rows in the results.
If you want just 1 location and it does not matter which 1 then add aggregation to your query:
SELECT e.[UniqueName], MAX(l.[location]) AS location
FROM [Employees] as e
LEFT JOIN Locations as l
ON e.[UniqueName] = l.[UniqueName]
GROUP BY e.[UniqueName]
You can use MIN()
instead of MAX()
.
Upvotes: 1
Reputation: 1270873
You can use a correlated subquery:
SELECT e.[UniqueName],
(SELECT TOP 1 l.[location]
FROM locations as l
WHERE e.[UniqueName] = l.[UniqueName]
) as location
FROM [Employees] as e;
Note: There is no ORDER BY
so this returns an arbitrary location.
If location
can be duplicated for a given UniqueName
, you will get an error. To solve that, you can use an aggregation functions:
SELECT e.[UniqueName],
(SELECT MAX(l.[location])
FROM locations as l
WHERE e.[UniqueName] = l.[UniqueName]
) as location
FROM [Employees] as e;
Upvotes: 0