Reputation: 6081
First of all I've been on this for 2 hours and I did full research before posting it here. There are similar questions to this here on SO but none of them matches my problem (So please don't mark it as duplicate).
I have a huge table with more than ~100000 records and what I'm trying to do is to delete certain duplicate rows.
Example I've following rows:
ID NAME NUMBER
1 'NIL' 1234
2 'NIL' 1234
3 'NIL' 1234
4 'MES' 5989
5 'NL' 1235
6 'NL' 12385
7 'NL' 1235
8 'MES' 5989
Expected output (which ID to keep or which ID to delete doesn't matter):
ID NAME NUMBER
1 'NIL' 1234
2 'NIL' 1234
3 'NIL' 1234
4 'MES' 5989
5 'NL' 1235
6 'NL' 12385
I've tried this:
DELETE FROM tableA
WHERE ID NOT IN
(
SELECT ID From (SELECT * FROM tableA) AS t1
WHERE PHONE NOT IN (1235,5989)
GROUP BY NAME, NUMBER
UNION
SELECT MIN(ID) From (SELECT * FROM tableA) AS t2
WHERE PHONE IN (1235,5989)
GROUP BY NAME, NUMBER
)
The above query works perfects on test data of 10 or 20 rows but when applied to by live DB it takes ages and keeps fetching the rows endlessly (the query has been running since more than 40 mins now) I don't know what I'm doing wrong. Any help would be greatly appreciated.
Upvotes: 0
Views: 69
Reputation: 381
CREATE TABLE tableB like tableA;
INSERT INTO tableB (SELECT * FROM tableA GROUP BY name,number);
RENAME TABLE tableA to tableA_with_dups, tableB to tableA;
*note that this is not necessarily the best solution, depending on if this is a running system, table indexing, etc. If you have more requirements just add to the comments and i'll add in a better fitted answer.
*also, could you please run this for us: SHOW CREATE TABLE tableA; then we can leverage indexing properly.
Upvotes: 1
Reputation: 3026
I would put the inner clause in a temp table and select from that.
CREATE TEMPORARY TABLE IF NOT EXISTS IdsToKeep AS
(SELECT ID From (SELECT * FROM tableA) AS t1
WHERE PHONE NOT IN (1235,5989)
GROUP BY NAME, NUMBER
UNION
SELECT MIN(ID) From (SELECT * FROM tableA) AS t2
WHERE PHONE IN (1235,5989)
GROUP BY NAME, NUMBER)
DELETE FROM tableA
WHERE ID NOT IN
(SELECT ID FROM IdsToKeep)
Unfortunately, I'm not that good with MySql, so the syntax may be a little bit off.
Upvotes: 0