Reputation: 195
I need to get employees with smallest salary in their departments I did it using anti join.
select emp.employee_id,emp.last_name,emp.salary,emp.department_id
from employees emp
left join employees sml
on sml.department_id = emp.department_id and sml.salary < emp.salary
where sml.employee_id is null and emp.department_id is not null
But I've been told that it's possible to do it using window function using one select. However I can't group it by department_id and use it at the same time. Is that a bug or me being stupid?
SELECT department_id,
min(salary) OVER (partition by department_id) as minsalary
FROM employees;
GROUP BY department_id
SQL Developer says 00979. 00000 - "not a GROUP BY expression"
Upvotes: 19
Views: 74622
Reputation: 1
SELECT t.employee_id, t.department_id, t.last_name, t.salary
FROM (SELECT employee_id, department_id, last_name, salary,
MIN(salary) OVER(PARTITION BY department_id) AS dept_min_salary
FROM employees) t
WHERE t.salary = t.dept_min_salary;
Upvotes: 0
Reputation: 11
Since window function is not a aggregate function, you need to placed non aggregated labels like as department_id, salary under group by. Looking at your question, not suggested to use window function.
SELECT department_id,
min(salary) OVER (partition by department_id) as minsalary
FROM employees;
GROUP BY department_id, salary;
Upvotes: -1
Reputation: 570
First thing to remember is that windowed functions (like OVER()
clause) work on the result of the query. That is: Server first executes the query and only then applies the windowed function as defined by you (of course, it's an oversimplification of what actually happens, but good enough to illustrate my point).
This means that you can actually use windowed function and group by clause in the same query, but you need to encapsulate group by
aggregate with windowed function
aggregate, like this:
SELECT department_id,
min(min(salary)) OVER (partition by department_id) as minsalary
FROM employees
GROUP BY department_id;
However, I agree that this is not a good place to use windowed function. Matt's proposition - which I upvoted, full disclosure - is best here (ROW_NUMBER()
in CTE
or subquery
, then selecting only the desired rows in main SELECT
).
Upvotes: 25
Reputation: 14381
WITH cte AS (
SELECT
emp.*
,ROW_NUMBER() OVER (PARTITION BY emp.department_id ORDER BY emp.salary) as RowNumber
FROM
employees emp
)
SELECT c.*
FROM
cte c
WHERE
c.RowNumber = 1
You can use ROW_NUMBER()
to get 1 row of lowest salary by department as above. If you want all rows in the case of ties switch it to RANK()
Otherwise you can do it with MIN() OVER
but this will give you ties
WITH cte AS (
SELECT
emp.*
,MIN(emp.salary) OVER (PARTITION BY emp.department_id) as DeptMinSalary
FROM
employees emp
)
SELECT c.*
FROM
cte c
WHERE
c.salary = c.DeptMinSalary
As a derived table instead of a Common Table Expression:
SELECT t.*
FROM
(SELECT
emp.*
,ROW_NUMBER() OVER (PARTITION BY emp.department_id ORDER BY emp.salary) as RowNumber
FROM
employees emp) t
WHERE
t.RowNumber = 1
One last thought on the subject because you ask "Can I group by in a SQL query with a window function?" Alex covers that the PARTITION BY
is like a sub grouping within the Window Function. But to use a GROUP BY
grouping with a Window function means that the GROUP BY
result set would be evaluated PRIOR to the Window Function being evaluated.
Upvotes: 6
Reputation: 191580
If you run your second query without the group by
- which you may have already tried, from the extra semicolon in what you posted - you'll see that you get one row for every employee, each showing the minimum salary in their department. That minimum is the analytic min()
because it has a window clause. The PARTITION BY
is the equivalent of a GROUP BY
, but without the aggregation over the whole result set.
The simplest way to get the same result (almost) is to use the RANK()
analytic function instead, which ranks the values based on the partition and order you supply, while allowing for ties:
SELECT employee_id, last_name, salary, department_id,
RANK() OVER (PARTITION BY department_id ORDER BY salary) AS rnk
FROM employees
ORDER BY department_id, rnk;
EMPLOYEE_ID LAST_NAME SALARY DEPARTMENT_ID RNK
----------- ------------------------- ---------- ------------- ----------
200 Whalen 4400 10 1
202 Fay 6000 20 1
201 Hartstein 13000 20 2
119 Colmenares 2500 30 1
118 Himuro 2600 30 2
117 Tobias 2800 30 3
116 Baida 2900 30 4
115 Khoo 3100 30 5
114 Raphaely 11000 30 6
...
102 De Haan 17000 90 1
101 Kochhar 17000 90 1
100 King 24000 90 3
...
For departments 20 and 30 you can see the row ranked 1 is the lowest salary. For department 90 there are two employees ranked 1, because they have the same lowest salary.
You can use that as an inline view and select just those rows ranked number 1:
SELECT employee_id, last_name, salary, department_id
FROM (
SELECT employee_id, last_name, salary, department_id,
RANK() OVER (PARTITION BY department_id ORDER BY salary) AS rnk
FROM employees
)
WHERE rnk = 1
ORDER BY department_id;
EMPLOYEE_ID LAST_NAME SALARY DEPARTMENT_ID
----------- ------------------------- ---------- -------------
200 Whalen 4400 10
202 Fay 6000 20
119 Colmenares 2500 30
203 Mavris 6500 40
132 Olson 2100 50
107 Lorentz 4200 60
204 Baer 10000 70
173 Kumar 6100 80
101 Kochhar 17000 90
102 De Haan 17000 90
113 Popp 6900 100
206 Gietz 8300 110
178 Grant 7000
13 rows selected.
If you didn't have to worry about ties there is an even simpler alternative, but it ins't appropriate here.
Notice that this gives you one more row than your original query. You are joining on sml.department_id = emp.department_id
. If the department ID is null, as it is for employee 178, that join fails because you can't compare null to null with equality tests. Because this solution doesn't have a join, that doesn't apply, and you see that employee in the results.
Upvotes: 14
Reputation: 371
you do not need window function in this case, cause a simple group by
would work too.
And the error is correct, cause the window function isn't an aggregat function. And a window function can't be a Group by- member.
But you could use "distinct" instead.
SELECT DISTINCT department_id,
min(salary) OVER (partition by department_id) as minsalary
FROM employees;
In your Special case all this is oversized, of course. But I think understanding is the name of the game.
Upvotes: 2