Reputation: 651
I have two tables which are defined as below:
CREATE TABLE `a` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`cid` varchar(100) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL DEFAULT '',
`data` varchar(1024) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL DEFAULT '',
`update_time` timestamp(0) NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP(0),
PRIMARY KEY (`id`) USING BTREE
) ENGINE = InnoDB AUTO_INCREMENT = 150 CHARACTER SET = utf8 COLLATE = utf8_general_ci ROW_FORMAT = Compact;
there are 10k rows in a and 200k rows in b, all the data were generated by a random function. Below is a sample:
151 8VE6BU06 8VE6BU06 2019-09-12 23:07:39
Here are three tests,
1. Case 1 costs 2.889s
SELECT cid FROM `a` WHERE a.cid not in (select b.cid from b);
execution plan
1 PRIMARY a ALL 10094 Using where
2 SUBQUERY b ALL 199826
2. case 2 costs 628.699s
delete from `a` WHERE a.cid not in (select b.cid from b);
execution plan
1 PRIMARY a ALL 10094 Using where
2 DEPENDENT SUBQUERY b ALL 199826 Using where
3. case 3 costs 0.036s
alter table b add index cid(cid);
delete from `a` WHERE a.cid not in (select b.cid from b);
execution plan
1 PRIMARY a ALL 10094 Using where
2 DEPENDENT SUBQUERY b index_subquery cid cid 302 func 1 Using index
Question
MySQL Version: 5.6.20 innodb
Row data in table b is like:
151 8VE6BU06 8VE6BU06 2019-09-12 23:07:39
Upvotes: 2
Views: 478
Reputation: 142278
DELETE
and UPDATE
differently (not as good) as SELECT
. A later version (5.7? or 8.0) does better.DELETE
must create undo records in case of a crash midway through the DELETE
; this is quite costly relative to just selecting.EXPLAINs
say ALL
and ALL
. this implies that "for each row in one table, it scans all the rows of the other table". The third explain is orders of magnitude faster because of the INDEX
.NOT IN ( SELECT ... )
was usually performed very poorly -- as in complete scans, etc.DELETE
" using a LEFT JOIN ... IS NULL
instead of the IN
approach.innodb_buffer_pool_size
is too small to hold all of b
, this would lead to a lot of I/O, hence the long time for query 2.cid
and data
, it is not practical to discuss the I/O needs further.Upvotes: 4
Reputation: 3535
Read is faster than write/delete as @shadow commented.
Refer to https://dev.mysql.com/doc/refman/5.5/en/subquery-optimization.html
Indexing
Indexing will give you great performance.
To illustrate in simple way, index is like page0 of document.
Page0 index
Mysql p1
Oracle p2
...
Postgresql p100
Page1
Some looooooong Mysql material
Page2
Some looooooong Oracle material
...
Page100
Some looooooong Postgresql material
Read with index
It takes relatively short time to find key within page0. For example, after finding postgresql in index and go to page 100, this way is much faster than reading hundreds pages.
Delete with index
You can delete only postgresql in page0 index. You don't have to erase all content in page100.
For more information, please refer to How does database indexing work?
Upvotes: 0
Reputation: 2116
Are these truly measured in seconds? That seems very excessive for such simple queries on moderate size tables.
That said, in general a DELETE
will take longer than a SELECT
for the same data. The database engine needs to assure that any foreign key references are not broken and update indexes etc. This behavior seems normal.
Upvotes: 1