Maximilian
Maximilian

Reputation: 127

How can I optimize this query with subquery?

I have the following MySQL query:

SELECT value_1, (
    SELECT value_4 
    FROM table_1
    WHERE value_3 < value_1 
    ORDER BY value_3 DESC  
    LIMIT 1
)
AS result_value 
FROM table_2 
WHERE value_1 BETWEEN 1325372400000 AND 1328050800000  
ORDER BY value_1

Which returns 32 results and takes 6.6 seconds to run. The idea is to get the entries from table_1 where value_3 is "closest" to value_1 in table_2.

The two queries the query is comprised of, namely

SELECT value_1
AS result_value 
FROM table_2 
WHERE value_1 BETWEEN 1325372400000 AND 1328050800000  
ORDER BY value_1

and (for example)

SELECT value_4 
FROM table_1
WHERE value_3 < 1328050800000 
ORDER BY value_3 DESC  
LIMIT 1

each take 0.03 seconds to run. Computing the accumulated time for 32 result sets, the compound query should take no longer than 1 second, or probably even less (because I/O overhead for individual queries is not accounted for). Yet at 6.6 seconds, it takes much longer.

Why is that, how can I optimize it? Or is there another/superior way to reach my goal?

Update:

Table definitions:

table_1 (MyISAM, 700000 entries):

'id', 'int(10) unsigned', 'NO', 'PRI', NULL, 'auto_increment'
'value_3', 'bigint(20) unsigned', 'NO', 'UNI', '0', ''
'value_4', 'bigint(20) unsigned', 'NO', '', '0', ''

table_2 (MyISAM, 4000 entries):

'value_1', 'bigint(20) unsigned', 'NO', 'PRI', NULL, ''

EXPLAIN [query]:

'1', 'PRIMARY', 'table_2', 'range', 'PRIMARY,value_3_UNIQUE', 'PRIMARY', '8', NULL, '32', 'Using where; Using index'
'2', 'DEPENDENT SUBQUERY', 'table_1', 'index', 'value_3,value_3_value_4', 'value_3', '8', NULL, '1', 'Using where'

Upvotes: 2

Views: 130

Answers (3)

ypercubeᵀᴹ
ypercubeᵀᴹ

Reputation: 115660

If you have an index on value_1 in table_2 and a compound (value3, value4) index on table_1, then only the indices will be used by the query.


You could also try this query:

SELECT value_1
     , value_4 AS result_value 
FROM table_2 
  JOIN table_1
    ON table_1.value_3 =
       ( SELECT value_3
         FROM table_1
         WHERE value_3 < value_1 
         ORDER BY value_3 DESC  
         LIMIT 1
       )
WHERE value_1 BETWEEN 1325372400000 AND 1328050800000  
ORDER BY value_1

Upvotes: 1

Bohemian
Bohemian

Reputation: 425448

Use a bit of mysql kung fu:

SELECT * from
(SELECT value_1, value_4
FROM table_2 
join table_1 on value_3 < value_1
WHERE value_1 BETWEEN 1325372400000 AND 1328050800000  
ORDER BY value_1, value_3 DESC) x
GROUP BY value_1 
ORDER BY value_1

The "kung fu" is that with mysql when the columns you're not grouping by are not aggregated (eg SUM() etc), you get the first row encountered for each group. If you use this technique on an ordered result set, you can get the value you're after.

Not only is this easier to code, but you'll notice that just one pass is made over the tables (not one query per row as you tried). It should perform well.


Edit:

There has been conjecture by some commenters that this special form of GROUP BY is "indeterminate" and/or "not officially supported" etc. The documentation states that the row selected in "indeterminate", however I have never seen nor heard of mysql selecting any row other than the first encountered and using an order by on the inner select is used and relied upon in countless production queries.

FWIW, I am happy to recommend this approach as being "reliable" and production-worthy.

Upvotes: 1

Angelin Nadar
Angelin Nadar

Reputation: 9320

Here Iam just avoiding use of virtual(derived) table. The main reason behind virtual table usage was sorting which has been achieved by applying order by clause within the groupcconcat().

SELECT SUBSTRING_INDEX(group_concat(value_1 
                                    ORDER BY value_1, value_3 DESC),',',1) as value_1, 
       SUBSTRING_INDEX(group_concat(value_4
                                    ORDER BY value_1, value_3 DESC),',',1) as value_4 
FROM table_2 join table_1 on value_3 < value_1 

WHERE value_1 BETWEEN 1325372400000 AND 1328050800000  

GROUP BY value_1  
ORDER BY value_1

Upvotes: 0

Related Questions