Reputation: 912
I have a MySQL table of 10 million rows and 3 columns, in following format:
id time num
ca65e871-d758-437e-b76f-175234760e7b 2020-11-14 23:08:05.553770 11112222222
...
For running the first query below, I indexed the table on (num, time)
and it works very fast (<5 milliseconds on 10 million rows table):
SELECT COUNT(*)
FROM TABLE_NAME
WHERE time >= '2020-11-14 23:08:05.553752' AND num = 11112222222
However I also need to execute count(distinct)
on the same table with between
clause, something like this:
SELECT COUNT(DISTINCT num)
FROM TABLE_NAME
WHERE time >= '2020-11-14 23:08:05.553752'
AND num BETWEEN (11112222222 - 30)
AND (11112222222 + 30)
This turns out to be significantly slower, around 200 milliseconds. Is there a way to speed the execution time of the second query on the same table?
Upvotes: 0
Views: 721
Reputation: 142433
This is a 2D problem. And your WHERE
clause is like a "bounding box".
Add INDEX(time, num)
to give the Optimizer another choice.
If you expect to have more complex queries based on 2 ranges, see http://mysql.rjweb.org/doc.php/find_nearest_in_mysql
Upvotes: 1
Reputation: 42728
If your MySQl is 8+ then try:
WITH RECURSIVE
cte AS ( SELECT 11112222222 - 30 num
UNION ALL
SELECT num + 1 FROM cte WHERE num < 11112222222 + 30 )
SELECT COUNT(*)
FROM cte
WHERE EXISTS ( SELECT NULL
FROM TABLE_NAME
WHERE TABLE_NAME.num = cte.num
AND time >= '2020-11-14 23:08:05.553752' )
If you'll often execute such query then I'd suggest to create service table with the numbers from -30 to 30 and use it instead of recursive CTE.
Upvotes: 1