Reputation: 497
I have an arbitrarily large MySQL table where there are duplicated rows, however to determine which rows are duplicated I need to match the data from two columns. A modified snippet of the table is below.
mysql> select * from DATA_STATUS where METADATA_ID='6ac00785-abcd-3f4a-defg-12b8ed23abff';
+--------+------------+--------------------------------------+-------------+
| ID | STATUS | METADATA_ID | METADATA_FK |
+--------+------------+--------------------------------------+-------------+
| 1 | 3 | 6ac00785-abcd-3f4a-defg-12b8ed23abff | 1234 |
+--------+------------+--------------------------------------+-------------+
| 2 | 3 | 6ac00785-abcd-3f4a-defg-12b8ed23abff | 1234 |
+--------+------------+--------------------------------------+-------------+
| 3 | 0 | 6ac00785-abcd-3f4a-defg-12b8ed23abff | 1234 |
+--------+------------+--------------------------------------+-------------+
| 4 | 0 | 6ac00785-abcd-3f4a-defg-12b8ed23abff | 1234 |
+--------+------------+--------------------------------------+-------------+
| 5 | 1 | 6ac00785-abcd-3f4a-defg-12b8ed23abff | 1234 |
+--------+------------+--------------------------------------+-------------+
| 6 | 2 | 6ac00785-abcd-3f4a-defg-12b8ed23abff | 1234 |
+--------+------------+--------------------------------------+-------------+
I want to do a select on the entire table where there are multiple of the same METADATA_ID
where the duplicated METADATA_ID
rows have a STATUS
of 3. I know how to query a table for duplicates in one column, but am struggling to figure out how to match on duplicates and other conditions.
From the example data, the row IDs that match this condition are 1 and 2 but not 3.
EDIT: Additional information for clarification and TL;DR conditions
The overall criteria for a duplicate is STATUS=3
and METADATA_ID > 1
, the snippet below shows the rows that meet this.
+--------+------------+--------------------------------------+-------------+
| ID | STATUS | METADATA_ID | METADATA_FK |
+--------+------------+--------------------------------------+-------------+
| 1 | 3 | 6ac00785-abcd-3f4a-defg-12b8ed23abff | 1234 |
+--------+------------+--------------------------------------+-------------+
| 2 | 3 | 6ac00785-abcd-3f4a-defg-12b8ed23abff | 1234 |
+--------+------------+--------------------------------------+-------------+
I want the query to either pull back just one row that contains the ID
, STATUS
and METADATA_ID
(METADATA_FK
is optional) when a duplicate is found, or all instances of the duplication, either is fine. The data is not counted as a duplicate if STATUS
is not 3 or the METADATA_ID
only exists once in the table.
Upvotes: 2
Views: 72
Reputation: 21533
Assuming you want all the records which have those fields duplicated:-
SELECT some_table.ID,
some_table.STATUS,
some_table.METADATA_ID,
some_table.METADATA_FK
FROM
(
SELECT STATUS,
METADATA_ID,
METADATA_FK
FROM some_table
WHERE status_id = 3
GROUP BY STATUS, METADATA_ID, METADATA_FK
HAVING COUNT(*) > 1
) sub0
INNER JOIN some_table
ON sub0.STATUS = some_table.STATUS
AND sub0.METADATA_ID = some_table.METADATA_ID
AND sub0.METADATA_FK = some_table.METADATA_FK
I have assumed that metafata_fk is part of the uniqueness of a record
Upvotes: 1
Reputation: 1513
Try this:
select *
from yourtable
where
status_id = 3 and
metadata_id in (
select metadata_id
from yourtable
where status_id = 3
group by metadata_id
having count(*) > 1
);
If not all rows are necessary you can use this simple query:
select * from yourtable where status_id = 3 group by metadata_id having count(*) > 1;
Upvotes: 1