Conor Cosnett
Conor Cosnett

Reputation: 1346

How to delete rows from table1 that are present in table2 (MySQL)?

Consider MySQL tables: table1, table2

table1:

+------+------+
| col1 | col2 |
+------+------+
|    1 | a    |
|    2 | b    |
|    3 | c    |
+------+------+

table2:

+------+------+
| col1 | col2 |
+------+------+
|    1 | a    |
|    2 | b    |
+------+------+

What is the most efficient way to delete the rows in table1 based on the rows in table2 such that the desired output looks like this:

+------+------+
| col1 | col2 |
+------+------+
|    3 | c    |
+------+------+

Please note that this is a minimalist example of a problem I am having with two very large tables:

Here is code to create table1 and table2:

DROP TABLE IF EXISTS table1;
CREATE TABLE table1 (
    col1 BIGINT,
    col2 TEXT
);
INSERT INTO table1 VALUES (1, 'a');
INSERT INTO table1 VALUES (2, 'b');
INSERT INTO table1 VALUES (3, 'c');
DROP TABLE IF EXISTS table2;
CREATE TABLE table2 (
    col1 BIGINT,
    col2 TEXT
);
INSERT INTO table2 VALUES (1, 'a');
INSERT INTO table2 VALUES (2, 'b');

MySQL = 5.7.12

Question:

From reading this site and others I notice that there are several ways to do this operation in MySQL. I am wondering which is the fastest way for large tables (30M+ rows)? Here are some ways I have discovered:

1. method using DELETE

DELETE t1
FROM table1 t1
INNER JOIN table2 t2
ON t1.col1=t2.col1;

2. method using DELETE FROM

 DELETE FROM t1
 USING table1 t1 
 INNER JOIN table2 t2 
 ON ( t1.col1 = t2.col1 );

3. method using DELETE FROM

 DELETE FROM table1 WHERE col1 in (SELECT col1 FROM table2);

Is there a faster way to do this that I have not listed here?

Upvotes: 1

Views: 1354

Answers (2)

Rick James
Rick James

Reputation: 142540

CREATE TABLE t_new LIKE t
INSERT INTO t_new
    SELECT *
        FROM t
        LEFT JOIN exclude ON ...
        WHERE exclude.id IS NULL;
RENAME TABLE t TO t_old,
             t_new TO t;
DROP TABLE t_old;

DELETE (and UPDATE) choke on handling a huge number of rows; SELECT does not.

A possible optimization on this would be to drop all indexes except the PRIMARY KEY and re-add them after finishing.

(FOREIGN KEYs can be a big nuisance; do you have any?)

Upvotes: 1

Ergest Basha
Ergest Basha

Reputation: 9048

I will suggest another method it is not as practical as the mentioned method , but maybe it will be much faster for larger tables.

It is mentioned on [MySQL documentation] (https://dev.mysql.com/doc/refman/8.0/en/delete.html)

InnoDB Tables
If you are deleting many rows from a large table, you may exceed the lock table size for an InnoDB table. To avoid this problem, or

simply to minimize the time that the table remains locked, the following strategy (which does not use DELETE at all) might be helpful:

Select the rows not to be deleted into an empty table that has the same structure as the original table:

INSERT INTO t_copy SELECT * FROM t WHERE ... ;
Use RENAME TABLE to atomically move the original table out of the way and rename the copy to the original name:

RENAME TABLE t TO t_old, t_copy TO t;
Drop the original table:

DROP TABLE t_old;

--Follow below steps:

--Rename the table:

RENAME TABLE table1 TO table1_old;

--Create new table with primary key and all necessary indexes:

CREATE TABLE table1 LIKE table1_old;

USE THIS FOR MyISAM TABLES:

SET UNIQUE_CHECKS=0;
LOCK TABLES table1_old WRITE, table2 WRITE;
        ALTER TABLE table1 DISABLE KEYS;
            INSERT INTO table1 (select * from table1_old t1 where  col1 not in  (select col1 from table2 )); 
        ALTER TABLE table1 ENABLE KEYS;
SET UNIQUE_CHECKS=1;
UNLOCK TABLES;

-- USE THIS FOR InnoDB TABLES:

SET AUTOCOMMIT = 0; 
SET UNIQUE_CHECKS=0; 
SET FOREIGN_KEY_CHECKS=0;
LOCK TABLES table1_old WRITE, table2 WRITE;
    INSERT INTO table1 (select * from table1_old t1 where  col1 not in  (select col1 from table2 )); 
SET FOREIGN_KEY_CHECKS=1; 
SET UNIQUE_CHECKS=1; 
COMMIT; SET AUTOCOMMIT = 1;
UNLOCK TABLES;

Upvotes: 1

Related Questions