Reputation: 5147
MySQL table with couple of fields:
id
- PRIMARY KEY
url
- CHAR(255)
. url
field is also unique and indexed. Currently have couple of hundreds of thousands entries in that table. MySQL gets really slow.
The idea is: if I add hash BIGINT UNIQUE INDEXED
field, and create composite index hash/url
- will it work faster? Means will MySQL first check my hash/url
pair using hash
and then url
? Will it be faster?
If answer is yes - for what reason this is not transparently implemented for indexed strings in MySQL?
Upvotes: 1
Views: 1366
Reputation: 1954
+1 Quassnoi
Doing a lookup on hash would definitely be much faster. I created a similar style table but stored very long URIs, and had to use a TEXT type, so a composite key didn't apply there. URLs and URIs are often much longer than a char field can be. I've stored them broken down into multiple tables, and even then, after a few million URLs, I had to consider partitioning the URL/URIs tables. Roughly:
table hostnames:
id int(11),
hostname char(255),
unique ( hostname )
table uri:
id int(11),
digest char(33),
uri text,
unique digest
table querystr:
-- like uri table
table urls:
id int(11),
fk_host int(11),
fk_uri int(11),
fk_query int(11)
unique u ( id, fk_host, fk_uri, fk_query )
If you need to search through a few million URLs or URIs in a free-text manner, using LIKE '%foo%'
won't work well. Consider a more text-search oriented MySQL extension or a library like Lucene/Solr for heavy searching.
Upvotes: 0
Reputation: 425251
The idea is: if I add hash BIGINT UNIQUE INDEXED field, and create composite index hash/url - will it work faster?
The key lookup on a BIGINT
field is faster than on a CHAR
field (surprisingly, performance gain is more noticeable on a key miss than on a key hit).
Note, however, that a BIGINT
hash has very high probability of hash collisions, that's why I'd not recommend to use it in a UNIQUE
field
Means will MySQL first check my hash/url pair using hash and then url?
If you create a composite key on (hash, url)
and search for the hash
only, it will use the ref
condition on the hash
part of the index.
Upvotes: 2