Mariadb(MySQL)：Query slow when using sub query

Question

original sql：

select * from A
join B on A.aid = b.aid
...(some join)
limit 0,1000

it will return results in 1s, but when using subquery, it was became really slow(in minutes)：

select * from
(
select * from A
join B on A.aid = b.aid
...(some join)
) tmp
limit 0,1000

when I execute with explain, it has a additional row：

select_type: PRIMARY
table: 
type: ALL
possible_keys: null
key_len: null
rows: 1504940
Extra: Using temporary; Using filesort

MySQL version：5.6

ps. I need to use the query in subquery way, because it is happens in a view read by another application.

Yuseferi · Accepted Answer

The first approach, with joins, is by far faster. In second the query will be executed for each row. Some databases optimize nested queries into joins though.

Join vs. sub-query

Article MySQL performance: INNER JOIN vs. sub-select

I found using a 'virtual table' instead of a ROW subquery is much faster on my table. It seems that the row subquery isn't optimized, where the join over the 'virtual table' is optimized.

Below are the queries and the 'EXPLAIN' returned for educational purposes.

-- Query using ROW subquery

EXPLAIN
SELECT 
* 
FROM 
region
WHERE 
ROW (PDB,CHAIN) IN (
SELECT 
region.PDB, 
region.CHAIN
FROM 
region LEFT JOIN split_domain USING (SUNID)
WHERE
split_domain.SUNID IS NULL
GROUP BY
PDB, CHAIN
HAVING
COUNT(*)>1
)
LIMIT 
10
;

+----+--------------------+--------------+--------+---------------+---------+---------+------------------------+-------+--------------------------------------+
| id | select_type        | table        | type   | possible_keys | key     | key_len | ref                    | rows  | Extra                                |
+----+--------------------+--------------+--------+---------------+---------+---------+------------------------+-------+--------------------------------------+
|  1 | PRIMARY            | region       | ALL    | NULL          | NULL    |    NULL | NULL                   | 57362 | Using where                          |
|  2 | DEPENDENT SUBQUERY | region       | ALL    | NULL          | NULL    |    NULL | NULL                   | 57362 | Using temporary; Using filesort      |
|  2 | DEPENDENT SUBQUERY | split_domain | eq_ref | PRIMARY       | PRIMARY |       3 | scop_1_65.region.SUNID |     1 | Using where; Using index; Not exists |
+----+--------------------+--------------+--------+---------------+---------+---------+------------------------+-------+--------------------------------------+
3 rows in set (0.04 sec)

I can't get any results out of the above (takes too long) - Perhaps the limit clause isn't kicking in?

-- Query using joined virtual table

EXPLAIN
SELECT 
* 
FROM 
region
INNER JOIN (
SELECT 
region.PDB, 
region.CHAIN
FROM 
region LEFT JOIN split_domain USING (SUNID)
WHERE
split_domain.SUNID IS NULL
GROUP BY
PDB, CHAIN
HAVING
COUNT(*)>1
) AS x
ON
region.PDB = x.PDB 
AND
region.CHAIN = x.CHAIN
LIMIT 
10
;

+----+-------------+--------------+--------+---------------------+-----------+---------+------------------------+-------+--------------------------------------+
| id | select_type | table        | type   | possible_keys       | key       | key_len | ref                    | rows  | Extra                                |
+----+-------------+--------------+--------+---------------------+-----------+---------+------------------------+-------+--------------------------------------+
|  1 | PRIMARY     |    | ALL    | NULL                | NULL      |    NULL | NULL                   |  8624 |                                      |
|  1 | PRIMARY     | region       | ref    | PDB,CHAIN,pdb_chain | pdb_chain |       5 | x.PDB,x.CHAIN          |     1 |                                      |
|  2 | DERIVED     | region       | ALL    | NULL                | NULL      |    NULL | NULL                   | 57362 | Using temporary; Using filesort      |
|  2 | DERIVED     | split_domain | eq_ref | PRIMARY             | PRIMARY   |       3 | scop_1_65.region.SUNID |     1 | Using where; Using index; Not exists |
+----+-------------+--------------+--------+---------------------+-----------+---------+------------------------+-------+--------------------------------------+
4 rows in set (1.02 sec)

The above returns ... 10 results in about 1 second 100 results in about 1 second 1000 results in about 1.5 seconds Full set (20437) in about 2 seconds

The former query doesn't return in 5 mins (even with limit 10).

I hope this is useful to anyone designing (or trying to optimize) complex subqueries, and that the precise details of the data are not necessary to convey the results presented here.

Mariadb(MySQL)：Query slow when using sub query

Answers (2)

Related Questions