GaryBox
GaryBox

Reputation: 33

SQL: Select Distinct on 1 column

I need to select distinct values but only for 1 column and the other columns need to show the latest record, i.e.:

customerID    Order Number    Order Date
00001         1000011         2017-01-01
00001         1000022         2017-01-10
00001         1000033         2017-02-01
00002         2000011         2016-12-01
00002         2000022         2017-01-01
00003         3000011         2017-03-01

I would need this to show as:

 customerID   Order Number    Order Date
 00001        1000033         2017-02-01
 00002        2000022         2017-01-01
 00003        3000011         2017-03-01

In Postgresql I would have used SELECT DISTINCT ON (customerID) then ordered by Order Date desc but this isn't possible in SQL Server.

I have tried using the Max function on Order Date, but this still return duplicates in Customer ID when applied like below:

SELECT DISTINCT [CustomerID], [Order No], Max([Order Date])
FROM [T.ORDERS]
GROUP BY [CustomerID], [JOBNO]

Upvotes: 2

Views: 3530

Answers (5)

dotNET
dotNET

Reputation: 35380

If your table is not huge, you could try something like this:

SELECT [CustomerID], SUBSTRING(Dummy, 0, CHARINDEX('*', Dummy) - 1) AS [Order Date],
SUBSTRING(Dummy, CHARINDEX('*', Dummy), LEN(Dummy) - CHARINDEX('*', Dummy)) AS [Order No],
FROM (
      SELECT [CustomerID], 
      Max(CONVERT(varchar, [Order Date], 101) + '*' + CAST([Order No] as varchar)) AS Dummy
      FROM [T.ORDERS] GROUP BY [CustomerID]
      )

What it is doing is to join Order Date and Order No fields with * character (which hopefully doesn't occur anywhere in either column data) and then pick its max value within each group. In the outer SELECT, we then split the max value on the * character to get back the two values.

Upvotes: 0

Dmitry Cat
Dmitry Cat

Reputation: 475

you may try an top 1 with ties function with ROW_NUMBER:

with Data as (
select '00001' customerID,         1000011 orderNumber,         cast('20170101'  as date)  orderdate union all
select '00001' customerID,         1000022 orderNumber,         '20170110' orderdate union all
select '00001' customerID,         1000033 orderNumber,         '20170201' orderdate union all
select '00002' customerID,         2000011 orderNumber,         '20161201' orderdate union all
select '00002' customerID,         2000022 orderNumber,         '20170101' orderdate union all
select '00003' customerID,         3000011 orderNumber,         '20170301' orderdate)
select top 1 with ties 
    CustomerID,
    OrderNumber,
    OrderDate 
from Data
order by 
    ROW_NUMBER() OVER (partition by CustomerID order by orderdate desc)

result:

CustomerID  orderNumber orderdate
00001       1000033     2017-02-01
00002       2000022     2017-01-01
00003       3000011     2017-03-01

Upvotes: 0

You may have this query

SELECT DISTINCT customerID, MAX(OrderNumber), MAX(OrderDate) FROM table;

distinct is faster than group by

Upvotes: 0

DhruvJoshi
DhruvJoshi

Reputation: 17126

You can use JOIN too

SELECT 
   A.[CustomerID], A.[Order No], A.[Order Date]
FROM [T.ORDERS] A INNER JOIN 
( 
  SELECT 
     [CustomerID], Max([Order Date])
  FROM [T.ORDERS]  
  GROUP BY A.[CustomerID], [JOBNO]
) B
ON A.[CustomerID]=B.[CustomerID] AND A.[Order Date]=B.[Order Date]

Upvotes: 2

Kannan Kandasamy
Kannan Kandasamy

Reputation: 13959

You can use row_number as below:

select * from
    (   Select *, RowN = Row_Number() over (partition by CustomerID order by [Order date] desc) from #yourtable ) a
   where a.RowN = 1

Upvotes: 0

Related Questions