Crayl
Crayl

Reputation: 1911

MySQL - Which of these both is the better index?

I have a website with a database of 2,4 million entries. Now I want to use mod_rewrite to produce clear URLs like example.com/this-is-a-beautiful-url. To recognize which URL belongs to which page I want to build a URL-table which refers to the website's content. Like "this-url-example" belongs to the post with the ID 123. Now 2 friends are telling me 2 different ways to build this table and I don't know which one is the best.

  1. Advice: "Make a MD5 hash of the URLs and save them in a binary field with an index".
  2. Advice: "Simply save the urls in plain text and put an index on it".

Which method would provide faster select-queries? Do you have another advice? Thanks!

Upvotes: 3

Views: 93

Answers (3)

Brian Minton
Brian Minton

Reputation: 3777

I would not use either mysql or md5 hashing. Instead, use a key-value store such as dbm or leveldb. If you are using Apache's mod_rewrite, it has a feature, RewriteMap that can read the keys and values directly from a dbm db, or by calling an external program. The type can be sdbm, gdbm, ndbm or db (Berkeley DB), and apache includes a program, httxt2dbm, to create the database. If you do want to use MySQL, RewriteMap can do that as well with the dbd option, and mod_dbd. Performance-wise, using a hash index on MySQL should be similar to Berkley DB.

Upvotes: 0

user1745602
user1745602

Reputation: 29

MD5 Hash was designed for data comparison purposes. I.e. checking that a password is valid or that a file remains unchanged.

If you plan on being able to change and modify the URL (for search engine optimization purposes) I would stick to Plain text.

Also if you want to create Mod Rewrite Commands (in your HTACCESS) file - it will need to be plain text to enable you to link that URL with a particular page.

Upvotes: 2

vvondra
vvondra

Reputation: 3162

Go for a compromise, have urls as sitename.com/12345/clean-url-format

You have both keywords in a readable URL and the identified which you can use to fetch the row by the primary key for speed.

It would be reasonable to redirect urls without the text part to the canonical version with it.

Upvotes: 2

Related Questions