Newbie1
Newbie1

Reputation: 157

How to optimize the query speed by using hashtable

There are about 1 million records. Query is needed for the pagination system.

The query looks like this:

SELECT field1, field2, field3 FROM table 
WHERE field4 = '$value' 
ORDER BY field5 ASC limit $offset, 30;

There are indices on field4 and field5.

I heard that:

making another table (table 6) that is an indexed hash of table4?

Searching numbers instead of text is going to be a lot faster so the query is something like:

SELECT field1, field2, field3 Force
Index(Table6) FROM table WHERE field 6
= '$hashvalue' AND field4 = '$value' ORDER BY field5 ASC limit $offset, 30;

It should help to eliminate 99.99% of data before it has to text search and should speed up your queries regardless of the offset.

How exactly should I implement it? Could you please help me understand the idea of hash tables in this example?

Upvotes: 3

Views: 789

Answers (2)

mike
mike

Reputation: 5213

The hash alternation may help, but the major bottleneck will be when users request high $offset.
In this case it is wise to have id auto increment primary key which you can utilize to perform the pagination. See this example

select id, name
FROM table
LIMIT 0, 3;

Returns something similar to

+----+-----------------------------------+
| id | name                              |
+----+-----------------------------------+
|  1 | Beauty                            |
|  4 | Goodbye Country (Hello Nightclub) |
|  5 | Glee                              |
+----+-----------------------------------+

You can see that the last ID is 5. When user requests 2nd page, instead of doing query such as

select id, name
FROM table
LIMIT 3, 3;

you can write this query

select id, name
FROM table
WHERE id > 5
LIMIT 0, 3;

Because the last ID was 5, writing WHERE id > 5 returns rows directly following the last row on the 1st page which represent rows the user wants to see on the 2nd page.

If you were to use the first query (LIMIT 3,3), MySQL would have to find the first 3 rows on 1st page in order to determine what the 4-6 rows will be.

You didn't mention what the type of field5 is, but you may be able to apply the same methodology to speed up your pagination. Please keep in mind that the field has to be indexed in order for this to work.

Upvotes: 1

Quassnoi
Quassnoi

Reputation: 425381

I believe the author meant an index on a hash, not a hash table:

ALTER TABLE mytable ADD field4_hash BINARY(16);

UPDATE mytable SET field4_hash = UNHEX(MD5(field4));

CREATE INDEX ix_mytable_field4hash_field5 ON mytable (field4_hash, field5)

SELECT  field1, field2, field3
FROM    mytable 
WHERE   field4_hash = UNHEX(MD5('$value'))
        AND field4 = '$value'
ORDER BY
        field5 ASC
LIMIT   $offset, 30;

The idea is that an MD5 hash of a string is normally shorter than the string itself and the index lookups on the hash are hence more efficient than those on the strings.

Upvotes: 2

Related Questions