Reputation: 21
ID FirstName LastName Gender Salary
1 Ben Hoskins Male 70000
2 Mark Hastings Male 60000
4 Ben Hoskins Male 70000
8 John Stanmore Male 80000
While running the query:
select *
from Employees
where Salary > (SELECT AVG(distinct SALARY) FROM employees)
It is displaying 2 records for 7000 it should display one. Why is it showing 2 records if I have used distinct?
Upvotes: 0
Views: 242
Reputation: 1269713
You have used distinct
the subquery, not in the outer query, so the outer query still sees the data with the duplicates.
You could solve this using:
select e.FirstName, e.LastName, e.Gender, e.Salary
from Employees e
where e.Salary > (SELECT AVG(distinct e2.SALARY) FROM employees e2);
However, something seems seriously wrong with the data model if you have rows with such duplicates. The data should be fixed.
In the meantime, you can work around the problem. You can phrase your query using a CTE that eliminates the duplicates:
with e as (
select e.*
from (select e.*,
row_number(). over (partition by firstname, lastname, gender order by id desc) as seqnum
from employees e
)
where seqnum = 1
)
select e.*
from e
where e.salary > (select avg(salary) from e)
Upvotes: 1
Reputation: 222442
It looks like you have duplicate rows in the table (that only differ by their id
), so I am going to assume that you want distinct
over the name, gender and salary rather than the salary only.
You seem to want distinct
both in the outer and inner query:
select distinct firstname, lastname, gender, salary
from employees
where salary > (
select avg(salary)
from (
select distinct firstname, lastname, gender, salary
from employees
) e
)
If your database supports window functions, this can be shortened:
select *
from (
select e.*, avg(salary) over() as avg_salary
from (
select distinct firstname, lastname, gender, salary
from employees
) e
) e
where salary > avg_salary
Upvotes: 1