Theflyingtaz
Theflyingtaz

Reputation: 23

SQL exists returns all rows

I have a mailing list and there are many entries that are duplicates. I'm trying to find the duplicates so I can remove them. When I run the full query below I get all of the items in the table (142,000 +). When I run the subquery I get only 5768 rows. I need all the columns for each duplicate row to decide which one gets deleted. What am I doing wrong that causes the full query to return all the rows?

select * from Mailinglist
where exists
(select count(*), mailaddress, CenterName
from Mailinglist
group by MailAddress, CenterName
having count(*)>1)

Upvotes: 1

Views: 2775

Answers (4)

lvoros
lvoros

Reputation: 312

There's no filtering (where) in your subquery so it will allways result something.

SELECT        *
FROM            Mailinglist AS ML
WHERE        EXISTS
        (SELECT        COUNT(*) AS Expr1, mailaddress, CenterName
        FROM            Mailinglist AS CNT
        WHERE        (ML.MailAddress = CNT.MailAddress) AND (ML.CenterName = CNT.CenterName)
        GROUP BY mailaddress, CenterName
        HAVING         (COUNT(*) > 1))

Upvotes: 0

Felix Pamittan
Felix Pamittan

Reputation: 31879

It's because EXISTS returns true if the subquery returns a row. Your subquery, is returning one or more rows, and thus returning TRUE for the EXISTS condition.

To get the MailingList with duplicates, you can just run the subquery:

SELECT
    COUNT(*),
    mailaddress,
    CenterName
FROM Mailinglist
GROUP BY
    MailAddress, CenterName
HAVING COUNT(*) > 1

To delete the duplicates, you can use ROW_NUMBER:

WITH Cte AS(
    SELECT *,
        rn = ROW_NUMBER() OVER(PARTITION BY MailAddress, Centername ORDER BY (SELECT NULL))
    FROM MailingList
)
DELETE FROM Cte WHERE rn > 1

Just replace the ORDER BY clause depending on which row of the duplicates you want to retain.

Upvotes: 0

Chaos Legion
Chaos Legion

Reputation: 2970

EXISTS returns true if there is even a single record in the query following it. What you are looking for is

select * from Mailinglist
where mailaddress IN 
(
    select mailaddress
    from Mailinglist
    group by MailAddress, CenterName
    having count(*)>1
)

Upvotes: 3

Giorgos Betsos
Giorgos Betsos

Reputation: 72165

You have to do it this way:

select t1.*, t2.cnt 
from Mailinglist t1
join (
  select count(*) as cnt, mailaddress, CenterName
  from Mailinglist
  group by MailAddress, CenterName
  having count(*)>1
) t2 ON t1.MailAddress = t2.MailAddress and t1.CenterName = t2.CenterName

Using EXISTS just checks for the existence, or not, of records: if the subquery returns one or more records, then EXISTS evaluates to true.

Upvotes: 3

Related Questions