Abdulquadir Shaikh
Abdulquadir Shaikh

Reputation: 101

SQL Query to get one row per group based on conditional logic

I have a table which has duplicate values in one of the columns. I am trying to eliminate rows with duplicate entries based on some logic described below.

Here is the table example -

RecoveryKey   DateTime      Duration CallDisposition NewTransaction Variable8   CallTypeID
7994113912466 12/4/18 16:26 19       52              Y              152643-5657     -1    
7994113912470 12/4/18 16:26 1168     29              Y              152643-5657   6390
7994113912751 12/4/18 16:51 2686     13              N              152643-5657   6390
7994113912756 12/4/18 13:51 56       2               Y              152643-5658   6390
7994113912756 12/4/18 13:52 125      13              Y              152643-5658   6390
7994113912756 12/4/18 13:55 125      13              N              152643-5658   6390

The column Variable8 has duplicate values for the first 3 records and for the last 3 records.

What do I want to do is, if CallDisposition is 29 and the next entry after that has a NewTransaction value of "N" then I have to keep the row with CallDisposition = 29.

For all other instance I want to select latest row irrespective of CallDisposition or NewTransaction value.

So my final result should look like this

RecoveryKey   DateTime      Duration CallDisposition NewTransaction Variable8   CallTypeID
7994113912470 12/4/18 16:26 1168     29              Y              152643-5657   6390
7994113912756 12/4/18 13:55 125      13              N              152643-5658   6390

Upvotes: 0

Views: 254

Answers (2)

Ross Bush
Ross Bush

Reputation: 15155

I took the liberty of adding a sequence as I assume there is some order. This will pick up the 29/N combo anywhere in the data.

DECLARE @T TABLE (datetime INT , CallDiusposition INT, NewTransaction NVARCHAR(10), Variable8 NVARCHAR(50))
INSERT @T VALUES(1,52,'Y',5657),(2,29,'Y',5657),(3,13,'N',5657),(4,2,'Y',5658),(5,13,'Y',5658),(6,13,'N',5658)

;WITH A AS
(
    SELECT *,
        RN = CASE WHEN LEAD(NewTransaction) OVER (PARTITION BY Variable8 ORDER BY datetime) ='N' AND CallDiusposition=29  THEN -1 ELSE ROW_NUMBER() OVER (PARTITION BY Variable8 ORDER BY datetime) END
    FROM @T
),
B AS
(
    SELECT Match=ROW_NUMBER() OVER (PARTITION BY Variable8 ORDER BY RN ASC),* FROM A 
)
SELECT * FROM B WHERE Match=1

Without CTE

SELECT * FROM
(
    SELECT Match=ROW_NUMBER() OVER (PARTITION BY Variable8 ORDER BY RN ASC),* FROM
    (
        SELECT *,
            RN = CASE WHEN LEAD(NewTransaction) OVER (PARTITION BY Variable8 ORDER BY datetime) ='N' AND CallDiusposition=29  THEN -1 ELSE ROW_NUMBER() OVER (PARTITION BY Variable8 ORDER BY datetime) END
        FROM @T
    )AS A
)AS B
WHERE Match=1

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1269503

You can use window functions. Assuming that the "29" would always be the second to last record, you can do:

select t.*
from (select t.*,
             row_number() over (partition by variable8 order by datetime desc) as seqnum,
             lag(CallDisposition) over (partition by variable8 order by datetime asc) as prev_CallDisposition
      from t
     ) t
where ( prev_CallDisposition = 29 and newTransaction = 'N' and seqnum = 2)  or
      ( prev_CallDisposition <> 29 or prev_CallDisposition is null or newTransaction <> 'N' and seqnum = 1 );

Assuming the 29 call can be anywhere is only slightly trickier using window functions:

select t.*
from (select t.*,
             sum(case when CallDisposition = 29 and next_newTransaction = 'N' then 1 else 0 end) over (partition by variable8) as num_matching_29
      from (select t.*,
                   row_number() over (partition by variable8 order by datetime desc) as seqnum,
                   lead(newTransaction) over (partition by variable8 order by datetime asc) as next_newTransaction
            from t
           ) t
     ) t
where (CallDisposition = 29 and next_newTransaction = 'N') or
      (num_matching_29 = 0 and seqnum = 1);

Upvotes: 2

Related Questions