user1777929
user1777929

Reputation: 797

SQL Server 2014 - How to find Distinct records

I have a list of Contact that have First Names, Last Names and Email addresses. Some Email addresses have multiple First and Last Names. I am more concerned about the Email Addresses. I really just want the top name for that Email Address.

My Code doesn't work obviously:

SELECT Salutation  
, FirstName  
, LastName  
, EmailAddress  
FROM Contact  
--GROUP BY EmailAddress  ---I know a Group by will surely help

I have tried

SELECT max(Salutation)
    ,max(FirstName)
    ,max(LastName)
    ,max(EMailAddress)
FROM Contact
WHERE EMailAddress NOT LIKE ''
GROUP BY EMailAddress

This worked but I'm wondering if there is a better way to do it.

Upvotes: 1

Views: 42

Answers (2)

SqlZim
SqlZim

Reputation: 38023

How do you define top name?

Using max() for names can easily return results with mixed names, e.g. 'Aaron Bertrand' and 'Itzik Ben-Gan' would return 'Itzik Bertrand'. If you have mixed Salutation then you will always get 'Mrs.' from 'Mr.' and 'Mrs.' which might not be appropriate either.


using top with ties and row_number():

select top 1 with ties
    Salutation
  , FirstName
  , LastName
  , EmailAddress
from contact
where EmailAddress <> ''
order by row_number() over (
  partition by EmailAddress
  order by FirstName /* your 'top' criteria here, FirstName is a placeholder */
  );

cross apply() version:

select distinct
    x.Salutation
  , x.FirstName
  , x.LastName
  , t.EmailAddress
from contact t
  cross apply (
    select top 1
        i.Salutation
      , i.FirstName
      , i.LastName
    from t as i
    where i.EmailAddress = t.EmailAddress
    order by i.FirstName
    ) as x
where t.EmailAddress <> ''

common table expression with row_number() version:

;with cte as (
  select *
    , rn = row_number() over (
             partition by EmailAddress
                 order by FirstName
            )
  from contact
  where EmailAddress <> ''
)
select  
    Salutation
  , FirstName
  , LastName
  , EmailAddress
from cte
where rn = 1;

I prefer using common table expressions, but the query inside it works just as well in the from clause:

row_number() in a subquery version:

select  
    Salutation
  , FirstName
  , LastName
  , EmailAddress
from (
  select *
    , rn = row_number() over (
             partition by EmailAddress
                 order by FirstName
            )
    from contact
    where EmailAddress <> ''
  ) s
where rn = 1;

Upvotes: 1

JBrooks
JBrooks

Reputation: 10013

Try:

SELECT Salutation  
, FirstName  
, LastName  
, EmailAddress  
FROM Contact  
WHERE EmailAddress IS NULL 
      OR ID IN
      (SELECT MAX(ID)
       FROM Contact  
       WHERE EmailAddress IS NOT NULL
       GROUP BY EmailAddress)

This will give you the last one added for each email address.

Upvotes: 0

Related Questions