stackuser
stackuser

Reputation: 672

Duplicates rows details in T-SQL

I am trying to write a query that will give me the details of all duplicate invoices per vendor.

I cannot use the group by since I need all the details of the invoices. So far this is what I have tried

 select 
     vendid, InvcNbr as InvcNbr, InvcDate, OrigDocAmt, PayDate,
     dense_RANK() over (partition by vendid order by invcnbr) RN
 from APDoc
 where InvcDate >= '10/01/2013'

Not sure how to proceed from here.

vendid   InvcNbr          InvcDate  OrigDoc   Paydate    RN
AAA  1067458361        10/2/2013     0.00   11/1/2013      8
AAA 1067461099         10/2/2013    16.08   11/1/2013      9
AAA 1067461099          10/2/2013   16.08   11/1/2013      9
AAA 1067461101          10/2/2013   16.08   11/1/2013     10
AAA 1067461101          10/2/2013   16.08   11/1/2013     10
AAA 1067461102          10/2/2013   16.08   11/1/2013     11
AAA 1067461102          10/2/2013   16.08   11/1/2013     11
AAA 1067461103          10/2/2013   92.45   11/1/2013     12
AAA 1067461103          10/2/2013   92.45   11/1/2013     12

Upvotes: 2

Views: 95

Answers (4)

philofinfinitejest
philofinfinitejest

Reputation: 4037

Assuming you have a primary key on the table, you can execute this (potentially) very quickly by using the EXISTS clause

select * from APDoc a1 where exists 
  (
    select 1 from APDoc a2 
    where a1.pk <> a2.pk
    and a1.vendid = a2.vendid and a1.invcnbr = a2.invcnbr
  )
and InvcDate >= '10/01/2013'
order by vendid , invcnbr

This allows the query optimizer to generate a plan that does not require aggregation which in a very large table with many duplicates will be expensive.

Upvotes: 0

James
James

Reputation: 12796

Use Group By with the Having clause to identify the duplicates, and then join these results to an outer query to see the details of the duplicates.

Here is an example of how you can do this.

SELECT a.vendid,a.InvcNbr as InvcNbr,a.InvcDate,a.OrigDocAmt,a.PayDate
FROM APDoc a
JOIN (
  SELECT vendid, InvcNbr
  FROM APDoc
  WHERE InvcDate >= '10/01/2013'
  GROUP BY vendid,InvcNbr HAVING COUNT(*) > 1
) b ON a.vendid = b.vendid AND a.InvcNbr = b.InvcNbr

Upvotes: 4

rwking
rwking

Reputation: 1032

This also works and might be easier to understand.

select InvcNbr, COUNT(InvcNbr) as [count]
into #temp1 
from #APDoc
group by InvcNbr

select a.vendid, a.InvcNbr, a.InvcDate, a.OrigDoc, a.Paydate, a.RN
from APDoc a, #temp1 b
where a.InvcNbr = b.InvcNbr
and b.[count] = 2

Upvotes: 0

ImGreg
ImGreg

Reputation: 2983

Something like this using Common-Table-Expressions could build the query up as required.

WITH TempCTE AS (SELECT InvcNbr, vendid, ROW_NUMBER() OVER (PARITION
BY vendid, InvcNbr order by invcnbr ) AS RowNum  FROM APDoc),

// Find all combinations of InvcNbr/vendid exist 
TempCTE2 AS  (SELECT InvcNbr, vendid FROM TempCTE WHERE RowNum > 1)

// Get all the combinations of InvcNbr/vendid 
SELECT * FROM TempCTE2
INNER JOIN APDoc ON TempCTE2.InvcNbr = APDoc.InvcNbr 
AND APDoc.vendid = TempCTE2.vendid

Upvotes: 0

Related Questions