Reputation: 63
I'm trying to return unique 'References' only, but also need to pull back the 'Surname' and 'Email' which may also be unique. In my example you can have multiple customers associated to a reference. Currently my query pulls back all the customers associated to the 'Reference'. As I only need the details for one of the customers, I only want to pull back the 'Reference' once with a 'Surname' and 'Email'.
The below query is an example but it relates to the scenario I'm trying to achieve.
In the example the 'UniqueID' is essentially the order in which the items were added to the table. I require this so I can filter by the most recent records. As the 'Reference' is not increment based I cannot use this.
So, in essence I want to count how many instances of a duplicate 'Reference' BUT only return one instance of 'Reference' plus the corresponding 'Email' and 'Surname' and then order by the most recent records added.
A flavour of what is in each table:
CustomerPackage: UniqueID (INT); Reference (STRING)
Customer: FirstName (STRING); Surname (STRING); Email (STRING); Address (STRING)
PurchaseDetails: PurchaseDate (DATE); PurchaseDescription (STRING); PurchaseType (INT)
SQL Script Example:
USE Example
GO
SELECT TOP 10
CP.Reference,
C.Surname,
C.Email
FROM CustomerPackage CP
INNER JOIN Customer C ON CP.UniqueID = C.UniqueID
INNER JOIN PurchaseDetails PD ON CP.UniqueID = PD.UniqueID
WHERE PD.PurchaseDate > dateadd(HOUR,10,getutcdate()) and PD.PurchaseDate < dateadd(DAY,29,getutcdate()) and PD.PurchaseType = 1
ORDER BY CP.UniqueID DESC
Output:
Reference Surname Email
1XX45 Smith [email protected]
1XX45 Jones [email protected]
1XX45 Betty [email protected]
4B678 Reeds [email protected]
=======================
Required output:
Reference Surname Email
1XX45 Smith [email protected]
4B678 Reeds [email protected]
Upvotes: 0
Views: 51
Reputation: 521249
If what you want is the single record with the most recent purchase date for each reference group of records, then you can use ROW_NUMBER
:
SELECT Reference, Surname, Email
FROM
(
SELECT
CP.Reference,
CP.UniqueID,
C.Surname,
C.Email,
ROW_NUMBER() OVER (PARTITION BY CP.Reference ORDER BY PD.PurchaseDate DESC) rn
FROM CustomerPackage CP
INNER JOIN Customer C
ON CP.UniqueID = C.UniqueID
INNER JOIN PurchaseDetails PD
ON CP.UniqueID = PD.UniqueID
WHERE
PD.PurchaseDate > dateadd(HOUR,10,getutcdate()) AND
PD.PurchaseDate < dateadd(DAY,29,getutcdate()) AND
PD.PurchaseType = 1
) t
WHERE rn = 1
ORDER BY
UniqueID DESC;
Upvotes: 0
Reputation: 3837
I think all you're missing is Row_Number
;WITH cteX
AS(
SELECT TOP 10
RN = ROW_NUMBER()OVER(PARTITION BY CP.Reference ORDER BY CP.UniqueId DESC),
CP.Reference,
C.Surname,
C.Email,
CP.UniqueId
FROM CustomerPackage CP
INNER JOIN Customer C ON CP.UniqueID = C.UniqueID
INNER JOIN PurchaseDetails PD ON CP.UniqueID = PD.UniqueID
WHERE PD.PurchaseDate > dateadd(HOUR,10,getutcdate()) and PD.PurchaseDate < dateadd(DAY,29,getutcdate()) and PD.PurchaseType = 1
)
SELECT
Reference
,Surname
,Email
FROM cteX X
WHERE X.RN = 1
ORDER BY UniqueId DESC
Upvotes: 1
Reputation: 333
What you are looking for is called a GROUP BY condition. In your situation you want to group by CP.Reference. This will cause all data for each reference to be summarized. The result of this will be that your query won't know what data to show for surname and email (1XX45 has 3 different surnames and e-mails). In your case you just want to show one surname or email so you can just select the highest value of these records. Your query will then become:
USE Example
GO
SELECT TOP 10
CP.Reference,
MAX(C.Surname),
MAX(C.Email)
FROM CustomerPackage CP
INNER JOIN Customer C ON CP.UniqueID = C.UniqueID
INNER JOIN PurchaseDetails PD ON CP.UniqueID = PD.UniqueID
WHERE PD.PurchaseDate > dateadd(HOUR,10,getutcdate()) and PD.PurchaseDate < dateadd(DAY,29,getutcdate()) and PD.PurchaseType = 1
ORDER BY CP.UniqueID DESC
GROUP BY CP.Reference
Upvotes: 0