parchambeau
parchambeau

Reputation: 1141

Optimize MYSQL ranking mod query

I'm having some major issues getting a query that thins out a data set to return results in a timely manner. I've pasted below the indexes on this table (which I thought would be enough to improve the speed) as well as the logic of the query.

One thing I tried was removing the "ORDER BY" in the inner query, but that actually did not improved the timing on it whatsoever, as well as cleaning out some additional unneeded columns I was selecting which improve the time, but did not make it "fast enough".

SELECT unix_date, price FROM
(SELECT @row := @row +1 as row_num, unix_date, price 
FROM (SELECT @row:=0, unix_date, price FROM 
price_data WHERE created_date >= '2017-03-26 00:00:00' AND created_date 
<= '2017-06-26 23:59:59' AND currency= 'USD' ORDER BY unix_date DESC) 
AS p) AS d
WHERE MOD(row_num, 288) = 1;

The whole point of this query is just trying to return a result set of price data points (unix timestamp, price) but thin it out to only return every 288th (or X) data points. This table is honestly pretty small currently (total rows: 198109) so I am having a hard time understanding why the query would take so long to return.

Here are the indexes on the table currently:

| Table              | Non_unique | Key_name     | Seq_in_index | 
Column_name  | Collation | Cardinality | Sub_part | Packed | Null | 
Index_type | Comment | Index_comment |
+--------------------+------------+--------------+--------------+------
--------+-----------+-------------+----------+--------+------+---------
---+---------+---------------+
| price_data |          0 | PRIMARY      |            1 | 
unix_date    | A         |      200002 |     NULL | NULL   |      | 
BTREE      |         |               |
| price_data |          0 | PRIMARY      |            2 | 
currency     | A         |      200002 |     NULL | NULL   |      | 
BTREE      |         |               |
| price_data |          1 | created_date |            1 | 
created_date | A         |      200002 |     NULL | NULL   |      | 
BTREE      |         |               |
| price_data |          1 | price        |            1 | price        
| A         |      200002 |     NULL | NULL   | YES  | BTREE      |         
|               |
+--------------------+------------+--------------+--------------+------
--------+-----------+-------------+----------+--------+------+---------
---+---------+--

Per suggestion I've added the create table:

  CREATE TABLE `price_data` (
  `created_date` datetime NOT NULL,
  `unix_date` int(11) NOT NULL,
  `currency` varchar(255) NOT NULL DEFAULT '',
  `price` decimal(10,6) DEFAULT NULL,
  PRIMARY KEY (`unix_date`,`currency`),
  KEY `created_date` (`created_date`),
  KEY `price` (`price`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1 |

Any advice on how to improve the speed of this query would be greatly appreciated.

EDIT: Could the issue be something along the lines of the final WHERE of this query, is actually evaluating information against a "fake column" of row_num, which is derived from the previous inner queries, thus does not really have an index on it per se? So when the WHERE is evaluating it is not doing so with the speed of something a normal indexed column would?

Upvotes: 1

Views: 334

Answers (1)

Bill Karwin
Bill Karwin

Reputation: 562310

Change the primary key to currency, unix_date. Since you have an equality condition for currency, and a range condition for unix_date, you should put the column with the equality condition first. Then both the range condition on unix_date and the ORDER BY should use the primary key order.

Apply the condition to unix_date, not create_date, to get it to use the primary key index.

You have to use a derived table subquery, but you don't have to use two nested levels of subqueries.

SELECT row_num, unix_date, price
FROM (
    SELECT @row := @row + 1 AS row_num, unix_date, price
    FROM (SELECT @row := 0) AS _init
    CROSS JOIN price_data
    WHERE currency = 'USD' 
     AND unix_date BETWEEN UNIX_TIMESTAMP('2017-03-26 00:00:00') 
                       AND UNIX_TIMESTAMP('2017-06-26 23:59:59')
    ORDER BY unix_timestamp DESC
) AS t
WHERE MOD(row_num, 288) = 1

You should learn to use EXPLAIN to help you analyze index usage.

You might also like my presentation How to Design Indexes, Really, and the video: https://www.youtube.com/watch?v=ELR7-RdU9XU

MySQL 8.0 should have windowing functions, so look for that next year sometime.

Upvotes: 2

Related Questions