Reputation: 101
I have a table which has duplicate values in one of the columns. I am trying to eliminate rows with duplicate entries based on some logic described below.
Here is the table example -
RecoveryKey DateTime Duration CallDisposition NewTransaction Variable8 CallTypeID
7994113912466 12/4/18 16:26 19 52 Y 152643-5657 -1
7994113912470 12/4/18 16:26 1168 29 Y 152643-5657 6390
7994113912751 12/4/18 16:51 2686 13 N 152643-5657 6390
7994113912756 12/4/18 13:51 56 2 Y 152643-5658 6390
7994113912756 12/4/18 13:52 125 13 Y 152643-5658 6390
7994113912756 12/4/18 13:55 125 13 N 152643-5658 6390
The column Variable8
has duplicate values for the first 3 records and for the last 3 records.
What do I want to do is, if CallDisposition
is 29 and the next entry after that has a NewTransaction
value of "N" then I have to keep the row with CallDisposition
= 29.
For all other instance I want to select latest row irrespective of CallDisposition
or NewTransaction
value.
So my final result should look like this
RecoveryKey DateTime Duration CallDisposition NewTransaction Variable8 CallTypeID
7994113912470 12/4/18 16:26 1168 29 Y 152643-5657 6390
7994113912756 12/4/18 13:55 125 13 N 152643-5658 6390
Upvotes: 0
Views: 254
Reputation: 15155
I took the liberty of adding a sequence as I assume there is some order. This will pick up the 29/N combo anywhere in the data.
DECLARE @T TABLE (datetime INT , CallDiusposition INT, NewTransaction NVARCHAR(10), Variable8 NVARCHAR(50))
INSERT @T VALUES(1,52,'Y',5657),(2,29,'Y',5657),(3,13,'N',5657),(4,2,'Y',5658),(5,13,'Y',5658),(6,13,'N',5658)
;WITH A AS
(
SELECT *,
RN = CASE WHEN LEAD(NewTransaction) OVER (PARTITION BY Variable8 ORDER BY datetime) ='N' AND CallDiusposition=29 THEN -1 ELSE ROW_NUMBER() OVER (PARTITION BY Variable8 ORDER BY datetime) END
FROM @T
),
B AS
(
SELECT Match=ROW_NUMBER() OVER (PARTITION BY Variable8 ORDER BY RN ASC),* FROM A
)
SELECT * FROM B WHERE Match=1
Without CTE
SELECT * FROM
(
SELECT Match=ROW_NUMBER() OVER (PARTITION BY Variable8 ORDER BY RN ASC),* FROM
(
SELECT *,
RN = CASE WHEN LEAD(NewTransaction) OVER (PARTITION BY Variable8 ORDER BY datetime) ='N' AND CallDiusposition=29 THEN -1 ELSE ROW_NUMBER() OVER (PARTITION BY Variable8 ORDER BY datetime) END
FROM @T
)AS A
)AS B
WHERE Match=1
Upvotes: 0
Reputation: 1269503
You can use window functions. Assuming that the "29" would always be the second to last record, you can do:
select t.*
from (select t.*,
row_number() over (partition by variable8 order by datetime desc) as seqnum,
lag(CallDisposition) over (partition by variable8 order by datetime asc) as prev_CallDisposition
from t
) t
where ( prev_CallDisposition = 29 and newTransaction = 'N' and seqnum = 2) or
( prev_CallDisposition <> 29 or prev_CallDisposition is null or newTransaction <> 'N' and seqnum = 1 );
Assuming the 29 call can be anywhere is only slightly trickier using window functions:
select t.*
from (select t.*,
sum(case when CallDisposition = 29 and next_newTransaction = 'N' then 1 else 0 end) over (partition by variable8) as num_matching_29
from (select t.*,
row_number() over (partition by variable8 order by datetime desc) as seqnum,
lead(newTransaction) over (partition by variable8 order by datetime asc) as next_newTransaction
from t
) t
) t
where (CallDisposition = 29 and next_newTransaction = 'N') or
(num_matching_29 = 0 and seqnum = 1);
Upvotes: 2