Reputation: 203
i need a SQL-Query to delete duplicates from a table. Lets start with my tables
rc_document: (there are more entries, this is just an example)
+----------------+-------------+----------------------+
| rc_document_id | document_id | rc_document_group_id |
+----------------+-------------+----------------------+
| 1 | 1 | 1 |
| 2 | 2 | 1 |
| 3 | 3 | 1 |
| 4 | 4 | 1 |
| 5 | 1 | 2 |
| 6 | 3 | 2 |
+----------------+-------------+----------------------+
(document_id can be exists in mulitple rc_document-group´s)
rc_document_group:
+----------------------+----------+
| rc_document_group_id | priority |
+----------------------+----------+
| 1 | 1 |
| 2 | 2 |
+----------------------+----------+
Each rc_document can be joined with the rc_document_group. In the rc_document_group is the priority for each rc_document.
I want to delete the rc_document rows with document_id which have not the highest priority in the rc_document_group. Because the document_id can be exists in multiple rc_document-group´s .. i just want to keep that one, with the highest priority.
here is my expected rc_document table after deleting duplicate document_id´s:
+----------------+-------------+----------------------+
| rc_document_id | document_id | rc_document_group_id |
+----------------+-------------+----------------------+
| 2 | 2 | 1 |
| 4 | 4 | 1 |
| 5 | 1 | 2 |
| 6 | 3 | 2 |
+----------------+-------------+----------------------+
the rc_document´s with rc_document_id 1 and 3 must be deleted, because there document_id 1 and 3 are in another rc_document_group with higher priority.
Im new in sql and i have no idea how to write these sql query ... thank for your help!!
Upvotes: 1
Views: 71
Reputation: 926
First, you could join the two tables in order to get the corresponding priority on each row. After that, you could use the analytic function MAX() to get, for each row, the max priority within each group of document_id. At this point, you filter out the rows where the priority is not equal to the max priority in the group.
Try this query:
SELECT t.rc_document_id,
t.document_id,
t.rc_document_group_id
FROM (SELECT d.*,
g.priority,
MAX(g.priority) OVER(PARTITION BY document_id) max_priority
FROM rc_document d
INNER JOIN rc_document_group g
ON d.rc_document_group_id = g.rc_document_group_id) t
WHERE t.priority = t.max_priority
Upvotes: 1