SQL query find all entries which match ALL other entries

Question

Probably be better if I explain via data :)

Say if I have a table

ItemId  Identifier1 Identifier2
1       'Test 1'    NULL
1       'Test 2'    NULL

So ItemId 1 is identified by 'Test 1' AND 'Test 2'. This could be a combination of Identifer1 or Identifier 2 e.g.

ItemId  Identifier1 Identifier2
1       'Test 1'    NULL
1        NULL       1

In this case ItemId 1 is identified by 'Test 1' and the number 1. And Item can have as many rows to identify as is possible.

I want to find all records (in a table with same structure as above) which have duplicate identifiers.

My current incorrect approach validates with just one row, so

ItemId  Identifier1 Identifier2
1       'Test 1'    NULL
1       'Test 2'    NULL

and

ItemId  Identifier1 Identifier2
2       'Test 1'    NULL
2       'Test 3'    NULL

are coming back as duplicate records because of the 'Test 1' match. This is incorrect. It needs to check all rows.

I thought about using an 'ALL' but that didn't help! I imagine I would need to use some magic with an aggregate/window function, but I cannot come up with the correct query.

Please help!

Gordon Linoff · Accepted Answer

You want to find all items where all rows for the item match exactly the same rows for another item.

This is more complicated because you have two columns identifying each row and each NULLs are important. But not too complicated:

SELECT t1.itemid, t2.itemid
FROM (SELECT t.*, COUNT(*) OVER (PARTITION BY itemid) AS cnt
      FROM table t
     ) t1 FULL OUTER JOIN
     (SELECT t.*, COUNT(*) OVER (PARTITION BY itemid) AS cnt
      FROM table t
     ) t2
     ON t1.itemId <> t2.itemId AND t1.cnt = t2.cnt AND
        (t1.identifier1 = t2.identifier1 OR t1.identifier1 IS NULL AND t2.identifier1 IS NULL) AND
        (t1.identifier2 = t2.identifier2 OR t1.identifier2 IS NULL AND t2.identifier2 IS NULL) 
GROUP BY t1.itemid, t2.itemid
HAVING COUNT(*) = COUNT(t1.itemid) AND COUNT(*) = COUNT(t2.itemid);

SQL query find all entries which match ALL other entries

Answers (1)

Related Questions