SQL Server : select correct columns of duplicated records with group by clause

Question

I have the following SQL Server table:

DECLARE @Tbl TABLE
(    
    Id INT IDENTITY NOT NULL PRIMARY KEY,
    MemberId INT,
    Guid UNIQUEIDENTIFIER,
    ExpiryYear INT,
    Hash CHAR(5)
)   

INSERT INTO @Tbl (MemberId, Guid, ExpiryYear, Hash)       
VALUES
(1, NEWID(), 2017, 'Hash1'),
(1, NEWID(), 2018, 'Hash2'),
(1, NEWID(), 2018, 'Hash2'),
(2, NEWID(), 2020, 'Hash3'),
(2, NEWID(), 2017, 'Hash1')

I need the data grouped by ExpiryYear and Hash. How can I also get the Id and Guid of one record that belongs to each group? I've tried something like:

select 
    max(Id) as Id, 
    max(MemberId) as MemberId, 
    max(Guid) as Guid, 
    ExpiryYear, 
    Hash 
from 
    @Tbl 
where 
    MemberId = 1 
group by 
    ExpiryYear, Hash

But this usually gives me the Id of the third record and the Guid of the second. How can I get the Id and Guid from the same record?

The output of this query is wrong: the Id is from the third record, but the Guid is from the second.

I am using SQL Server 2014.

Gordon Linoff · Accepted Answer

I think you want something like this:

select . . .
from (select t.*,
             row_number() over (partition by ExpiryYear, Hash order by ??) as seqnum
      from @tbl t
     ) t
where seqnum = 1;

The ?? represents the ordering column, which determines which row is returned for each (ExpiryYear, Hash) combination. If you don't care which row you get, you can use ExpiryYear or (select NULL).
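For illustration, here is one way the query could be filled in against the sample table from the question. This is only a sketch under a few assumptions that are not in the original answer: it picks the row with the highest Id in each group (ORDER BY Id DESC), it keeps the question's MemberId = 1 filter, and the aliases src and ranked are arbitrary names. The table variable is declared in the same batch because it does not survive across batches.

```sql
-- Sample table and data from the question.
DECLARE @Tbl TABLE
(
    Id INT IDENTITY NOT NULL PRIMARY KEY,
    MemberId INT,
    Guid UNIQUEIDENTIFIER,
    ExpiryYear INT,
    Hash CHAR(5)
);

INSERT INTO @Tbl (MemberId, Guid, ExpiryYear, Hash)
VALUES
(1, NEWID(), 2017, 'Hash1'),
(1, NEWID(), 2018, 'Hash2'),
(1, NEWID(), 2018, 'Hash2'),
(2, NEWID(), 2020, 'Hash3'),
(2, NEWID(), 2017, 'Hash1');

-- Number the rows inside each (ExpiryYear, Hash) group and keep row 1.
-- Assumption: the row with the highest Id should win, hence ORDER BY Id DESC.
SELECT ranked.Id, ranked.MemberId, ranked.Guid, ranked.ExpiryYear, ranked.Hash
FROM (SELECT src.*,
             ROW_NUMBER() OVER (PARTITION BY src.ExpiryYear, src.Hash
                                ORDER BY src.Id DESC) AS seqnum
      FROM @Tbl src
      WHERE src.MemberId = 1   -- filter taken from the question's attempt
     ) ranked
WHERE ranked.seqnum = 1
ORDER BY ranked.Id;
```

With the sample data this should return two rows: Id 1 for (2017, Hash1) and Id 3 for (2018, Hash2). Because every column comes from the single row that was numbered 1, the Id and Guid always belong to the same record, unlike separate max() calls, which are each evaluated independently.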
