Reputation: 89
I need to select duplicate rows based on two columns in a join, and i can't seem to figure out how that is done.
Currently i got this:
SELECT s.name,administrative_site_id as adm_id,s.external_code,si.identifier_value
FROM suppliers s
INNER JOIN suppliers_identifier si
ON s.id = si.supplier_id
And the output is something along the lines of below:
| Name | adm_id | external_code |identifier_value |
|:-----------|------------:|:------------: |:----------------:|
| Warlob | 66323 | ext531 | id444 |
| Ozzy | 53123 | ext632 | id333 |
| Motorhead | 521 | ext733 | id222 |
| Perez | 123 | ext833 | id111 |
| Starlight | 521 | ext934 | id222 |
| Aligned | 123 | ext235 | id111 |
What i am looking for, is how to simply select these 4 rows, as they are duplicates based on column: adm_id and Identifier_value
| Name | adm_id | external_code |identifier_value |
|:-----------|------------:|:------------: |:----------------:|
| Motorhead | 521 | ext733 | id222 |
| Perez | 123 | ext833 | id111 |
| Starlight | 521 | ext934 | id222 |
| Aligned | 123 | ext235 | id111 |
Upvotes: 1
Views: 1264
Reputation: 1612
Or an alternate way that may perform better on big datasets:
with t as (
SELECT s.name,administrative_site_id as adm_id,s.external_code,si.identifier_value
COUNT(*) OVER (PARTITION BY administrative_site_id ,identifier_value ) AS cnt
FROM suppliers s
INNER JOIN suppliers_identifier si
ON s.id = si.supplier_id)
select name, adm_id, external_code, identifier_value
from t
where cnt > 1
Upvotes: 0
Reputation: 1195
First group by ADM_ID, IDENTIFIER_VALUE and find groups that has more than one row in it. Then select all rows that has these couples
SELECT S.NAME
,ADMINISTRATIVE_SITE_ID AS ADM_ID
,S.EXTERNAL_CODE
,SI.IDENTIFIER_VALUE
FROM SUPPLIERS S INNER JOIN SUPPLIERS_IDENTIFIER SI ON S.ID = SI.SUPPLIER_ID
WHERE (ADMINISTRATIVE_SITE_ID, SI.IDENTIFIER_VALUE) IN (SELECT ADMINISTRATIVE_SITE_ID AS ADM_ID, SI.IDENTIFIER_VALUE
FROM SUPPLIERS S INNER JOIN SUPPLIERS_IDENTIFIER SI ON S.ID = SI.SUPPLIER_ID
GROUP BY ADM_ID, IDENTIFIER_VALUE
HAVING COUNT(*) > 1)
Upvotes: 3