Reputation: 21
I have a simple MySQL table with 40000 rows:
id CHAR(36) (i am using UUIDs as primary keys)
number_from (INT)
number_to (INT)
with 3 indexes:
primary(id)
number_from(number_from)
number_to(number_to)
it looks like simple query like following takes fairly large amount of time (like 30 seconds or more):
SELECT * FROM numbers WHERE (number_from >= 703911711 AND number_to <= 703911800)
it there anything i can do to optimize performance?
Upvotes: 2
Views: 1702
Reputation: 31
Old post, I know, but I like to keep StackOverflow a bit up-to-date if possible.
So, if you need a RANGE(from,to)
kind of function, please note that nowadays MariaDB has an ENGINE called SEQUENCE that will do this for you:
SELECT `seq` FROM `seq_0_to_100`
as long as the from and to numbers are within the limits of BIGINT(20) UNSIGNED
Upvotes: 0
Reputation: 48357
Conventional indexes don't work when the bounds of your range are stored in different attributes. As lots of people have discovered and documented, the solution is to use geospatial indexing. This was the first hit in google.
Since mysql geospatial indexing works in 2 dimensions, the trick is to map the range to one dimension and use a fixed scale in the other dimension.
Upvotes: 0
Reputation: 142298
You have two problems that are not easily solved.
But first, let's make sure you have the cache settings right. If you are using MyISAM, key_buffer_size
should be about 20% of available RAM. Or for InnoDB, innodb_buffer_pool_size
should be about 70% of available RAM. If you had small settings, this change would only move the problem from being I/O to being CPU. And it will not scale past when you cannot cache most of the table and its indexes in RAM. Reference
My blog explains why UUIDs are terrible for large indexes. That also provides a Stored Function to make the UUID smaller (BINARY(16)
), hence more cacheable; this will help with the I/O.
If you have "type-1" UUIDs (such as generated by MySQL), the blog also explains why, with the help of the Function, use of UUIDs can be scaled. But if it is some other flavor of UUID, and you must have at least one BTree (INDEX
or PRIMARY KEY
) with the UUID in it, your task is not scalable. Can you turn it into an AUTO_INCREMENT
? That would allow INSERTs
to be clumped at the 'end' of the table rather than randomly scattered around the table, thereby blowing out cache?
As for the ranges, you need to revamp the table to put only one of from/to in each row. This involves extra rows for the 'unowned' ranges. Then write a stored routine to use LIMIT 1
to prevent what you are seeing -- namely failure to effectively use any index. Another blog goes into the details, together with schema design and reference code for IPv4 (which might be what you are doing) and IPv6. My code is very scalable and eliminates the indexing problem you posed.
Upvotes: 0
Reputation: 311188
Having the indexes separated will force MySQL to scan them separately. If you index both number_from
and number_to
together, MySQL could perform the entire where
clause on a single index:
CREATE INDEX number_from_number_to_ind
ON numbers (number_from, number_to)
Adding the id
to the index will actually allow MySQL to perform the entire query without accessing the table, but at the expense of making the index significantly larger. You'd have to benchmark and see if this actually improves the query's performance or not:
CREATE INDEX all_columns_ind
ON numbers (number_from, number_to, id)
Upvotes: 1