jpp1jpp1
jpp1jpp1

Reputation: 181

Very slow Minimum value group by query

the following query:

select t.code, t.value, t.date
from readings t inner join (
    select code, min(date) as MinDate
    from readings
    where date >= "2018-11-1"
    group by code
) tm on t.code = tm.code and t.date = tm.MinDate

is taking +1min to respond.

The subquery:

select code, min(date) as MinDate
from readings
where date >= "2018-11-1"
group by code

returns in one second. Plus if you take the whole query and replace min() with max() it also takes just one second.

I'm using HeidiSQL and the strange thing is it reports: 1,578 sec. (+ 61,172 sec. network). But this makes no sense as the max() version returns roughly the same amount as data and only takes one sec.

Current indexes:

PRIMARY             BTREE   Yes No  id  31096905    A   YES      
code_date_unique    BTREE   Yes No  code    0       A   YES      
                                    date    0       A   YES
    code            BTREE   No  No  code 15801      A   YES      
    date            BTREE   No  No  date 1943556    A   YES      

Create Table:

CREATE TABLE `readings` (
`id` INT(15) NOT NULL AUTO_INCREMENT,
....
`code` VARCHAR(50) NULL DEFAULT NULL COLLATE 'latin1_spanish_ci',
`value` FLOAT(10,2) NULL DEFAULT NULL,
`date` DATETIME NULL DEFAULT NULL,
....
PRIMARY KEY (`id`),
UNIQUE INDEX `code_date_unique` (`code`, `date`),
INDEX `code` (`code`),
INDEX `code_DateSaved` (`code`, `dateSaved`),
INDEX `date` (`date`),
INDEX `datesaved` (`dateSaved`),
INDEX `origen` (`origen`)
)
COLLATE='utf8_spanish_ci'
ENGINE=MyISAM
AUTO_INCREMENT=584809517

Explain:

explain select  t.code, t.value, t.date
from readings t
inner join (
    select code, min(date) as MinDate
    from readings
    where date >= "2018-11-1"
    group by code
) tm on t.code = tm.code and t.date = tm.MinDate 

+------+---------------+--------------+---------+---------------------------------------------+--------------------+-----------+--------------+---------+-----------------------------------------+
| "id" | "select_type" |   "table"    | "type"  |               "possible_keys"               |       "key"        | "key_len" |    "ref"     | "rows"  |                 "Extra"                 |
| "1"  | "PRIMARY"     | "<derived2>" | "ALL"   | \N                                          | \N                 | \N        | \N           | "15052" | ""                                      |
| "1"  | "PRIMARY"     | "t"          | "ref"   | "code_date_unique,code,code_DateSaved,date" | "date"             | "9"       | "tm.MinDate" | "16"    | "Using where"                           |
| "2"  | "DERIVED"     | "readings"   | "range" | "date"                                      | "code_date_unique" | "62"      | \N           | "10"    | "Using where; Using index for group-by" |

Upvotes: 0

Views: 875

Answers (1)

Tim Biegeleisen
Tim Biegeleisen

Reputation: 522254

I don't know how to make your query faster by restructuring it. But we can try adding the following index to the readings table:

(code, date, value)
CREATE INDEX your_idx ON readings (code, date, value);

This composite index should speed up the GROUP BY subquery, making it easy for MySQL to find the minimum date for each group (and also making the WHERE clause easier to restrict even before the GROUP BY).

I edited my answer to include value in the index, so that all the columns in your SELECT are covered by the index.

Upvotes: 1

Related Questions