Scott
Scott

Reputation: 63

SQL - Retrieve Unique Values of single column where by multiple columns may be unique

I'm trying to return unique 'References' only, but also need to pull back the 'Surname' and 'Email' which may also be unique. In my example you can have multiple customers associated to a reference. Currently my query pulls back all the customers associated to the 'Reference'. As I only need the details for one of the customers, I only want to pull back the 'Reference' once with a 'Surname' and 'Email'.

The below query is an example but it relates to the scenario I'm trying to achieve.

In the example the 'UniqueID' is essentially the order in which the items were added to the table. I require this so I can filter by the most recent records. As the 'Reference' is not increment based I cannot use this.

So, in essence I want to count how many instances of a duplicate 'Reference' BUT only return one instance of 'Reference' plus the corresponding 'Email' and 'Surname' and then order by the most recent records added.

A flavour of what is in each table:

CustomerPackage: UniqueID (INT); Reference (STRING)
Customer: FirstName (STRING); Surname (STRING); Email (STRING); Address (STRING)
PurchaseDetails: PurchaseDate (DATE); PurchaseDescription (STRING); PurchaseType (INT)

SQL Script Example:

USE Example
GO

SELECT TOP 10

CP.Reference,
C.Surname,
C.Email

FROM CustomerPackage CP

INNER JOIN Customer C ON CP.UniqueID = C.UniqueID
INNER JOIN PurchaseDetails PD ON CP.UniqueID = PD.UniqueID

WHERE PD.PurchaseDate > dateadd(HOUR,10,getutcdate()) and PD.PurchaseDate < dateadd(DAY,29,getutcdate()) and PD.PurchaseType = 1

ORDER BY CP.UniqueID DESC

Output:

Reference   Surname     Email
1XX45       Smith       [email protected]
1XX45       Jones       [email protected]
1XX45       Betty       [email protected]
4B678       Reeds       [email protected]

=======================

Required output:

Reference   Surname     Email
1XX45       Smith       [email protected]
4B678       Reeds       [email protected]

Upvotes: 0

Views: 51

Answers (3)

Tim Biegeleisen
Tim Biegeleisen

Reputation: 521249

If what you want is the single record with the most recent purchase date for each reference group of records, then you can use ROW_NUMBER:

SELECT Reference, Surname, Email
FROM
(
    SELECT
        CP.Reference,
        CP.UniqueID,
        C.Surname,
        C.Email,
        ROW_NUMBER() OVER (PARTITION BY CP.Reference ORDER BY PD.PurchaseDate DESC) rn
    FROM CustomerPackage CP
    INNER JOIN Customer C
        ON CP.UniqueID = C.UniqueID
    INNER JOIN PurchaseDetails PD
        ON CP.UniqueID = PD.UniqueID
    WHERE
        PD.PurchaseDate > dateadd(HOUR,10,getutcdate()) AND
        PD.PurchaseDate < dateadd(DAY,29,getutcdate()) AND
        PD.PurchaseType = 1
) t
WHERE rn = 1
ORDER BY
    UniqueID DESC;

Upvotes: 0

Mazhar
Mazhar

Reputation: 3837

I think all you're missing is Row_Number

;WITH cteX
AS(
    SELECT TOP 10
    RN = ROW_NUMBER()OVER(PARTITION BY CP.Reference ORDER BY CP.UniqueId DESC),
    CP.Reference,
    C.Surname,
    C.Email,
    CP.UniqueId

    FROM CustomerPackage CP

    INNER JOIN Customer C ON CP.UniqueID = C.UniqueID
    INNER JOIN PurchaseDetails PD ON CP.UniqueID = PD.UniqueID

    WHERE PD.PurchaseDate > dateadd(HOUR,10,getutcdate()) and PD.PurchaseDate < dateadd(DAY,29,getutcdate()) and PD.PurchaseType = 1
)
SELECT
    Reference
    ,Surname
    ,Email
FROM cteX X
WHERE X.RN = 1
ORDER BY UniqueId DESC

Upvotes: 1

Stefan
Stefan

Reputation: 333

What you are looking for is called a GROUP BY condition. In your situation you want to group by CP.Reference. This will cause all data for each reference to be summarized. The result of this will be that your query won't know what data to show for surname and email (1XX45 has 3 different surnames and e-mails). In your case you just want to show one surname or email so you can just select the highest value of these records. Your query will then become:

USE Example
GO

SELECT TOP 10

CP.Reference,
MAX(C.Surname),
MAX(C.Email)

FROM CustomerPackage CP

INNER JOIN Customer C ON CP.UniqueID = C.UniqueID
INNER JOIN PurchaseDetails PD ON CP.UniqueID = PD.UniqueID

WHERE PD.PurchaseDate > dateadd(HOUR,10,getutcdate()) and PD.PurchaseDate < dateadd(DAY,29,getutcdate()) and PD.PurchaseType = 1
ORDER BY CP.UniqueID DESC
GROUP BY CP.Reference

Upvotes: 0

Related Questions