goofyui
goofyui

Reputation: 3492

How to fetch duplicate records in the time difference of 5-10 mins?

SQL Database.

I will have to identify the list of duplicate transactions happened with in the range of 5- 6 mins.

BillId  Trans   DateTime

A100125  Paid  2018-04-18 11:21:40.873 - Valid Transaction
A100125  Paid  2018-04-18 11:24:40.873 - Duplicate Transaction
A100125  Paid  2018-04-18 11:30:40.873 - Duplicate Transaction
A100125  Paid  2018-04-18 12:30:40.873 - Valid Transaction

I can generate a report for date range. This is about average difference of 5-10 mins.

Upvotes: 1

Views: 499

Answers (4)

DxTx
DxTx

Reputation: 3357

It goes without saying that there are better ways to do this. But, anyway, here we go.

SELECT q1.Billid, q1.Trans, q1.TranTime , 
       CONVERT(INT, Isnull(Datediff(minute, q2.TranTime, q1.TranTime), 0)) AS [Difference], 
       CASE 
         WHEN CONVERT(INT, Isnull(Datediff(minute, q2.TranTime, q1.TranTime), 100)) > 6 THEN 'Valid'  -- (diff > 6) || (diff = 0)
         ELSE 'Invalid' 
       END as Validity
FROM   (SELECT *,   
       Row_number() OVER (ORDER BY [Billid], [Trans], TranTime DESC) AS rn 
       FROM   tablename2) q1 
       LEFT JOIN ((SELECT *, 
                   Row_number()  OVER (ORDER BY [Billid], [Trans], TranTime DESC) AS rn 
                   FROM   tablename2)) q2 
       ON q1.rn = q2.rn - 1 
ORDER  BY TranTime ASC 

Result

+---------+-------+-------------------------+------------+----------+
| Billid  | Trans |        TranTime         | Difference | validity |
+---------+-------+-------------------------+------------+----------+
| A100125 | Paid  | 2018-04-18 11:21:40.873 |          0 | Valid    |
| A100125 | Paid  | 2018-04-18 11:24:40.873 |          3 | Invalid  |
| A100125 | Paid  | 2018-04-18 11:30:40.873 |          6 | Invalid  |
| A100125 | Paid  | 2018-04-18 12:30:40.873 |         60 | Valid    |
| A100125 | Paid  | 2018-04-18 12:31:40.873 |          1 | Invalid  |
+---------+-------+-------------------------+------------+----------+

Upvotes: 0

KeithL
KeithL

Reputation: 5594

I stole the other guys set-up:

But here is an answer using lag:

declare @tabel Table (BillId varchar(8000),  Trans varchar(50), DateOfSomethingProbablyTheTransaction   DateTime)
declare @interval int = 6
insert into @tabel
values ('A100125',  'Paid',  '2018-04-18 11:21:40.873'), --valid
('A100125',  'Paid',  '2018-04-18 11:24:40.873'), -- Duplicate Transaction
('A100125',  'Paid',  '2018-04-18 11:30:40.873'), -- Duplicate Transaction
('A100125',  'Paid',  '2018-04-18 12:30:40.873') -- Valid Transaction)

(
select BillId,  Trans,   DateOfSomethingProbablyTheTransaction dt
    ,lag(DateOfSomethingProbablyTheTransaction,1) over (partition by BillID,Trans order by DateOfSomethingProbablyTheTransaction)
    ,case when datediff(minute
                       ,lag(DateOfSomethingProbablyTheTransaction,1) over (partition by BillID,Trans order by DateOfSomethingProbablyTheTransaction)
                       ,DateOfSomethingProbablyTheTransaction)<=6
        then 'Invalid'
        else 'Valid'
     end DataCheck
from @tabel
)

Results:

BillId  Trans   dt                      (No column name)        DataCheck
A100125 Paid    2018-04-18 11:21:40.873 NULL                    Valid
A100125 Paid    2018-04-18 11:24:40.873 2018-04-18 11:21:40.873 Invalid
A100125 Paid    2018-04-18 11:30:40.873 2018-04-18 11:24:40.873 Invalid
A100125 Paid    2018-04-18 12:30:40.873 2018-04-18 11:30:40.873 Valid

and again using CTE: (Not sure when LAG started)

;with base as
(
select BillId,  Trans,   DateOfSomethingProbablyTheTransaction dt
    ,rn = row_number() over (partition by BillId,  Trans order by DateOfSomethingProbablyTheTransaction)
from @tabel
)

select base.*
        ,prior.dt
    ,Test = case when datediff(minute,isnull(prior.dt,'1/1/1900'),base.dt) <=6 then 'Invalid' else 'Valid' end
from base
    left join base as [prior] on base.rn-1 = [prior].rn
            and base.BillId=[prior].BillId
            and base.Trans = [prior].trans

Upvotes: 1

SCFi
SCFi

Reputation: 522

Here is something

in short there is an exist subquery in a case statement comparing each row. This is not the most efficient as datediff is not sargable

declare @tabel Table (BillId varchar(8000),  Trans varchar(50), DateOfSomethingProbablyTheTransaction   DateTime)
declare @interval int = 6
insert into @tabel
values ('A100125',  'Paid',  '2018-04-18 11:21:40.873'), --valid
('A100125',  'Paid',  '2018-04-18 11:24:40.873'), -- Duplicate Transaction
('A100125',  'Paid',  '2018-04-18 11:30:40.873'), -- Duplicate Transaction
('A100125',  'Paid',  '2018-04-18 12:30:40.873') -- Valid Transaction)

select *, 
case when exists(select 1 
                from @tabel t2 
                where t1.BillId = t2.BillId 
                and t1.Trans = t2.Trans 
                and DATEDIFF(MINUTE, t2.DateOfSomethingProbablyTheTransaction, t1.DateOfSomethingProbablyTheTransaction) <= @interval
                and t1.DateOfSomethingProbablyTheTransaction > t2.DateOfSomethingProbablyTheTransaction)
 then 'Invalid' else'valid' end as Validity
 from @tabel t1

Also please note this can be rewritten as a join, if you want it to look nicer as the query optimizer is just doing a join behind the scene also if the trans status is irrelevant it can be removed for checking for duplicates

Upvotes: 2

Jonathan Shields
Jonathan Shields

Reputation: 120

I don't understand 100% what you need but this might help. This gives transactions on the same bill id with different times, where the time difference is less than 7 minutes.

select *
from Transactions T1
INNER JOIN Transactions T2
ON T1.BillId=T2.BillId
AND T1.DateTime<>T2.DateTime
AND DATEDIFF(MI,T1.DateTime,T2.DateTime)<7

If I have misunderstood let me know.

Upvotes: 0

Related Questions