Searching in a historical table for a row in a timestamp, is so slow

Question

I'm working with historical tables which are storing each change from "real" tables.

At the moment of retrieving data in a list of timestamps, the performace is horrible.

Here a simplified version of my tables.

CREATE TABLE `changes` (
  `ts` datetime DEFAULT NULL
) ENGINE=InnoDB;

CREATE TABLE `history` (
  `h_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `start_ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `end_ts` timestamp NULL DEFAULT NULL,
  `pk` int(11) DEFAULT NULL,
  `note` mediumtext,
  PRIMARY KEY (`h_id`),
  KEY `history_pk_ts_tsev_IDX` (`pk`,`start_ts`,`end_ts`) USING BTREE
) ENGINE=InnoDB;

And this is the query i'm running:

SELECT * FROM `changes` AS `c`
JOIN `history` AS `h`
ON (`h`.`pk` = 9999
    AND `c`.`ts` >= `h`.`start_ts`
    AND `c`.`ts` < IFNULL(`h`.`end_ts`, `c`.`ts` + 1)
   )

With 2.500 row in changes and 55.000 rows in history, the query takes about 8 seconds to get the first row, and about 2 minutes to get all.

This tables will grow very rapidly, storing soon milions of rows.

This is the explain result:

|id |select_type |table |partitions |type |possible_keys          |key |key_len |ref |rows  |filtered|Extra                                              |
|---|------------|------|-----------|-----|-----------------------|----|--------|----|------|--------|---------------------------------------------------|
|1  |SIMPLE      |c     |           |ALL  |                       |    |        |    |2448  |100     |                                                   |
|1  |SIMPLE      |h     |           |ALL  |history_pk_ts_tsev_IDX |    |        |    |54227 |16.66   |Using where; Using join buffer (Block Nested Loop) |

I've tried to force the index:

SELECT * FROM `changes` AS `c`
JOIN `history` AS `h` FORCE INDEX (history_pk_ts_tsev_IDX)
ON (`h`.`pk` = 2476
    AND `c`.`ts` >= `h`.`start_ts`
    AND `c`.`ts` < IFNULL(`h`.`end_ts`, `c`.`ts` + 1)
   )

but now this query takes about 10 seconds for the first row. Again, the explain:

|id |select_type |table |partitions |type |possible_keys          |key                    |key_len |ref   |rows  |filtered|Extra                 |
|---|------------|------|-----------|-----|-----------------------|-----------------------|--------|------|------|--------|----------------------|
|1  |SIMPLE      |c     |           |ALL  |                       |                       |        |      |2448  |100     |                      |
|1  |SIMPLE      |h     |           |ref  |history_pk_ts_tsev_IDX |history_pk_ts_tsev_IDX |5       |const |27113 |33.33   |Using index condition |

I've tried to create indexes in any order and combinations of them with no luck:

(pk, start_ts, end_ts)
(start_ts, end_ts)
(end_ts)
(end_ts, start_ts)
...

I've tested in the biggest MySQL RDS available in AWS with the same results, hence is not a hardware problem.

I'm really lost. What am I missing?

Thanks.

Searching in a historical table for a row in a timestamp, is so slow

Answers (1)

Related Questions