Ravneet Kaur
Ravneet Kaur

Reputation: 21

DISTINCT not giving expected result

ID  FirstName   LastName    Gender  Salary
1   Ben         Hoskins     Male    70000
2   Mark        Hastings    Male    60000
4   Ben         Hoskins     Male    70000
8   John        Stanmore    Male    80000

While running the query:

select *
from Employees
where  Salary > (SELECT AVG(distinct SALARY) FROM employees)

It is displaying 2 records for 7000 it should display one. Why is it showing 2 records if I have used distinct?

Upvotes: 0

Views: 242

Answers (2)

Gordon Linoff
Gordon Linoff

Reputation: 1269713

You have used distinct the subquery, not in the outer query, so the outer query still sees the data with the duplicates.

You could solve this using:

select e.FirstName, e.LastName, e.Gender, e.Salary
from Employees e
where e.Salary > (SELECT AVG(distinct e2.SALARY) FROM employees e2);

However, something seems seriously wrong with the data model if you have rows with such duplicates. The data should be fixed.

In the meantime, you can work around the problem. You can phrase your query using a CTE that eliminates the duplicates:

with e as (
      select e.*
      from (select e.*,
                   row_number(). over (partition by firstname, lastname, gender order by id desc) as seqnum
            from employees e
           )
      where seqnum = 1
     )
select e.*
from e
where e.salary > (select avg(salary) from e)

Upvotes: 1

GMB
GMB

Reputation: 222442

It looks like you have duplicate rows in the table (that only differ by their id), so I am going to assume that you want distinct over the name, gender and salary rather than the salary only.

You seem to want distinct both in the outer and inner query:

select distinct firstname, lastname, gender, salary
from employees
where salary > (
    select avg(salary)
    from (
        select distinct firstname, lastname, gender, salary
        from employees
    ) e
)

If your database supports window functions, this can be shortened:

select *
from (
    select e.*, avg(salary) over() as avg_salary
    from (
        select distinct firstname, lastname, gender, salary
        from employees
    ) e
) e
where salary > avg_salary

Upvotes: 1

Related Questions