Bryan Gerrells
Bryan Gerrells

Reputation: 23

SQL Query for removing duplicate values

I have a table that has

Account, CompanyID, Mail_Address
1001,1,Main ST
1001,2,Main ST
1001,3,Florida ST
1002,1,South Main
1002,2,West Main
1002,3,Soth Main
1003,1,North Main
1003,2,North Main

I need to create a query that removes duplicate Mail_Address, which is easy enough. The problem is that the results returned are the last of the duplicated value, instead of the first duplicated value.

If my table was the above the results would be

 Account, CompanyID, Mail_Address
    1001,2,Main ST
    1001,3,Florida ST
    1002,2,West Main
    1002,3,South Main
    1003,2,North Main

Notice the query deleted the first records and kept the last of the duplicated values.

I'd perfer the outcome to be

 Account, CompanyID, Mail_Address
    1001,1,Main ST
    1001,3,Florida ST
    1002,1,South Main
    1002,2,West Main
    1003,1,North Main

My Query

 SELECT *
  FROM (SELECT  [ACCOUNT]
      ,[COMPANY]
      ,[MAIL_ADDRESS]
      , 

      ROW_NUMBER() OVER(PARTITION BY Mail_Address ORDER BY Account ASC) rn
                    FROM [SQLUserDB].[dbo].[DSDD01]                 

              ) a
WHERE rn = 1 

Upvotes: 0

Views: 47

Answers (4)

Gordon Bell
Gordon Bell

Reputation: 13633

Just to show you another way, using inner join of table on itself:

-- List Mail_Address to Delete
select t1.Account, t1.CompanyID, t1.Mail_Address
from [SQLUserDB].[dbo].[DSDD01] t1
inner join [SQLUserDB].[dbo].[DSDD01] t2
on t1.Account = t2.Account
and t1.Mail_Address = t2.Mail_Address
where t1.CompanyID > t2.Company_ID

-- Delete Duplicate Mail_Address
delete t1
from [SQLUserDB].[dbo].[DSDD01] t1
inner join [SQLUserDB].[dbo].[DSDD01] t2
on t1.Account = t2.Account
and t1.Mail_Address = t2.Mail_Address
where t1.CompanyID > t2.Company_ID

Upvotes: 0

Tahbaza
Tahbaza

Reputation: 9548

If you're after the lowest company value this will work:

SELECT Account, MIN(Company) as Company, Mail_Address
FROM SqlUserDb.dbo.DSDD01
GROUP BY Account, Mail_Address

If you're after the first listed company value this will do it, very close to what you had:

SELECT Account, Company, Mail_Address
FROM (
  SELECT Account, Company, Mail_Address,
    ROW_NUMBER() OVER(PARTITION BY Mail_Address ORDER BY Account, Company) CompanyRowNo
  FROM SqlUserDb.dbo.DSDD01
) x
WHERE x.CompanyRowNo = 1

Upvotes: 0

Lajos Arpad
Lajos Arpad

Reputation: 76426

You can achieve your intention using not exists:

select Account, CompanyID, Mail_Address
FROM [SQLUserDB].[dbo].[DSDD01] t1
where not exists (select 1
                  from [SQLUserDB].[dbo].[DSDD01] t2
                  where t1.Mail_Address = t2.Mail_Address and t2.CompanyID < t1.CompanyID)

Upvotes: 0

Lamak
Lamak

Reputation: 70638

Just add another condition to the ORDER BY:

 SELECT *
  FROM (SELECT  [ACCOUNT]
      ,[COMPANY]
      ,[MAIL_ADDRESS]
      , 

      ROW_NUMBER() OVER(PARTITION BY Mail_Address ORDER BY Account ASC, CompanyID) rn
                    FROM [SQLUserDB].[dbo].[DSDD01]                 

              ) a
WHERE rn = 1 

Upvotes: 1

Related Questions