Reputation: 23
CREATE TABLE EMPLOYEE (
empId INTEGER AUTO_INCREMENT PRIMARY KEY,
name TEXT NOT NULL,
dept TEXT NOT NULL
);
INSERT INTO EMPLOYEE(name, dept) VALUES ('Clark', 'Sales');
INSERT INTO EMPLOYEE(name, dept) VALUES ('Dave', 'Accounting');
INSERT INTO EMPLOYEE(name, dept) VALUES ('Ava', 'Sales');
SELECT *
FROM EMPLOYEE AS a
INNER JOIN EMPLOYEE b
ON a.empId = (SELECT MIN(b.empId));
Output:
+-------+-------+------------+-------+-------+------------+
| empId | name | dept | empId | name | dept |
+-------+-------+------------+-------+-------+------------+
| 1 | Clark | Sales | 1 | Clark | Sales |
| 2 | Dave | Accounting | 2 | Dave | Accounting |
| 3 | Ava | Sales | 3 | Ava | Sales |
+-------+-------+------------+-------+-------+------------+
I expect the query to return only the row with empId = 1.
Why does it returns all rows?
How is the INNER JOIN evaluating the ON clause with the subquery?
What is the correct behavior here?
How can I fix this to only get the row with the smallest empId?
Upvotes: 1
Views: 41
Reputation: 562771
Using aggregation functions in the ON
clause of a join doesn't do anything useful, because in that position, b.empId
doesn't refer to a set of rows, it only refers to one row at a time. So MIN(b.empId)
is not looking for the minimum out of all values of empId
, it's only looking for the minimum of the single value of b.empId
that is current at the time the join expression is evaluated, which happens repeatedly, one row at a time.
For the same reason, you can't use aggregate functions in the WHERE
clause:
... WHERE a.empId = MIN(b.empId) -- ERROR
The bottom line is that you may use aggregation functions in the select-list, or the HAVING
clause, or ORDER BY
clause, but not any other clause.
You tried to use a subquery to put the MIN(b.empId)
in a select-list, but since b.empId
refers to a value of the outer query, and your subquery doesn't have a FROM
clause of its own, it is really a single value within the scope of one row being evaluated for the join.
An easier way to get the employee with the least empId
is:
SELECT *
FROM EMPLOYEE AS a
ORDER BY empId
LIMIT 1
Upvotes: 3
Reputation: 736
An SQL join needs to directly link the columns of two tables, "a" and "b", and may then further limit the rows by adding additional filters e.g.
SELECT *
FROM EMPLOYEE AS a
INNER JOIN EMPLOYEE b
ON a.empId = b.empId
AND a.empId = (SELECT MIN(empId) FROM EMPLOYEE);
Output:
1 Clark Sales 1 Clark Sales
The subquery is processed first by the DMS and must not reference the outer tables.
Upvotes: 1