Jacobs
Jacobs

Reputation: 3

How to check if all records for a particular date are duplicated?

I have an issue where all records for particular dates are duplicated in a sales table.
I need to be able to filter the table to find which dates were duplicated.

I know how to get duplicates by using the below query fine.

SELECT COUNT(*) AS AmtOfDuplicates,
[DATE], CODE, QTY, TRANSACTION_NO
FROM SALES 
WHERE [DATE] >= '2023-01-01'
GROUP BY [DATE], CODE, QTY, TRANSACTION_NO
HAVING COUNT(*) > 1
ORDER BY [DATE], CODE, QTY, TRANSACTION_NO

The issue with this is that some duplicates are valid, ie. a certain transaction can have the same item and qty scanned more than once for a transaction, so thousands more records are returned.

Below is some sample data to explain further.

DATE CODE QTY TRANS_NO
2/2/2023 123 1 1
2/2/2023 XYZ 2 1
13/3/2023 123 1 2
13/3/2023 XYZ 2 2
13/3/2023 123 1 2
15/4/2023 123 4 3
15/4/2023 XYZ 3 4
15/4/2023 ABC 1 4
15/4/2023 123 4 3
15/4/2023 XYZ 3 4
15/4/2023 ABC 1 4

As you can see above, the duplicate of item 123 on 13/3/2023 is valid as the item was scanned more than once for that transaction, and we know that date is not duplicate as not all records for that date are twice. However you can see that every record for 15/4/2023 is twice, so that date was duplicate and I just need a way to find that date.

My query above would return:

AmtOfDuplicates DATE CODE QTY TRANS_NO
2 13/3/2023 123 1 2
2 15/4/2023 123 4 3
2 15/4/2023 ABC 1 4
2 15/4/2023 XYZ 3 4

What I need is just the date that has every record for that date twice:

DATE DUPLICATE
15/4/2023 YES

Upvotes: 0

Views: 65

Answers (2)

siggemannen
siggemannen

Reputation: 9287

You can do something like this:

SELECT  date
FROM    (
    SELECT  *
    ,   COUNT(*) OVER(PARTITION BY date, code, qty, trans_no) AS cnt
    FROM    (
        VALUES  (N'2/2/2023', N'123', 1, 1)
        ,   (N'2/2/2023', N'XYZ', 2, 1)
        ,   (N'13/3/2023', N'123', 1, 2)
        ,   (N'13/3/2023', N'XYZ', 2, 2)
        ,   (N'13/3/2023', N'123', 1, 2)
        ,   (N'15/4/2023', N'123', 4, 3)
        ,   (N'15/4/2023', N'XYZ', 3, 4)
        ,   (N'15/4/2023', N'ABC', 1, 4)
        ,   (N'15/4/2023', N'123', 4, 3)
        ,   (N'15/4/2023', N'XYZ', 3, 4)
        ,   (N'15/4/2023', N'ABC', 1, 4)
    ) t (DATE,CODE,QTY,TRANS_NO)
    ) x
GROUP BY date
HAVING max(cnt) = 2
AND min(cnt) = 2

Since you want to check so all rows match your criteria, it's usually simplest to create a counter by use of a window aggregate function in the first step, and then you can use the "real" group by to make sure all rows for a particular date match your condition.

Upvotes: 0

Joel Coehoorn
Joel Coehoorn

Reputation: 416149

For step one, do a query to show rows that are definitely unique:

SELECT DATE, CODE, QTY, TRANS_NO
FROM Sales
GROUP BY DATE, CODE, QTY, TRANS_NO
HAVING COUNT(*) = 1

This will exclude some valid (non-duplicate) rows, such as the original 123 -- 13/3/2023 example. That's okay, because we really only care about the date. Logically, rows matching any of the dates returned by this query are NOT duplicates.

So for step 2, let's group by these dates and select only that column:

SELECT DATE
FROM (
    SELECT DATE, CODE, QTY, TRANS_NO
    FROM Sales
    GROUP BY DATE, CODE, QTY, TRANS_NO
    HAVING COUNT(*) = 1
) d
GROUP BY DATE

Finally, for step 3, do a NOT IN(), EXCLUSION JOIN, or NOT EXISTS() to remove these rows:

Option 1 (Exclusion):

SELECT *
FROM Sales s
LEFT JOIN (
    SELECT DATE
    FROM (
        SELECT DATE, CODE, QTY, TRANS_NO
        FROM Sales
        GROUP BY DATE, CODE, QTY, TRANS_NO
        HAVING COUNT(*) = 1
    ) d
    GROUP BY DATE
) dt ON dt.DATE = s.DATE
WHERE dt.DATE IS NULL

Option 2 NOT EXISTS():

SELECT *
FROM Sales s
WHERE NOT EXISTS (
    SELECT 1 
    FROM (
        SELECT DATE
        FROM (
            SELECT DATE, CODE, QTY, TRANS_NO
            FROM Sales
            GROUP BY DATE, CODE, QTY, TRANS_NO
            HAVING COUNT(*) = 1
        ) d
        GROUP BY DATE
    ) dt
    WHERE dt.DATE = s.DATE
)

And now only duplicates will be shown.

Upvotes: 0

Related Questions