Mercury
Mercury

Reputation: 732

What data type and index to use for a searchable userName column in a MySQL database?

I have a MySQL database that has a InnoDB table called user.

Each user is required to create a user name during the creation of the account. The user name can only be between 1 and 40 characters (utf-8) long. The purpose of this user name is to allow for one user to search for another.

The problem is that I am not sure what datatype to use for the user name. Since the user name will be variable length a Varchar seems fitting but I am uncertain on how fast it is to index and do searches on a Varchar. The thought I had is that because I know the maximum character length I could use a Char(40) instead. Although this would waste some space due to the padding I would imagine it would make indexing and searching faster and more efficient since it is fixed length.

The other aspect I am uncertain about is if I should use a Fulltext index or a BTree index for this user name column. Or if there is an other index method I don't know about that would work better.

So which datatype in correlation with which index would give me the best, fastest and most scalable solution to this ?

Upvotes: 4

Views: 3454

Answers (2)

Rick James
Rick James

Reputation: 142356

username VARCHAR(40) CHARACTER SET utf8mb4 NOT NULL

INDEX(username)

WHERE username = 'some_username'

This is the only sane answer. You need up to 40 UTF-8 characters. Users will search only for full names, no partial stuff (LIKE, FULLTEXT, etc). You need to index, so don't worry that it is slightly less efficient than INT.

One possible variation: You may want username to be the PRIMARY KEY (instead of a 'secondary' INDEX) in some table.

Upvotes: 1

Blag
Blag

Reputation: 5894

Char vs Varchar

Char is useful when you have a fixed length, like with product reference or ISO-* field. Why ? because it'll spare you the use of a length byte on every line.

But in your case, the name will mostly never be over 20char, so to spare a single byte you'll burn more than 20 in 75% of your records: not the good way.

Varchar, definitely.


Index BTree vs FullText

Next, the index. As said, it depend of what kind of search you'll do.

For search that use ='pattern' or LIKE 'pattern%' the BTree index is really the best.

For search that only do LIKE '%pattern%' to search a partial sub string in a string, then, you don't really have anything.

The best you can do is having a "small" table that just map id = name, make your search on it and hope mysql will keep it in RAM to do the sequential read.

And last, you spoke about Fulltext index: this index is best suited for matching a word or begging of word in a text, so probably the same result than a BTree for you.

So, what the best way ?

Probably a BTree index. Yes you'll not cover the search inside the word, but for most case it'll be enough.

How to use it the best way ?

  • Run a first search with LIKE 'pattern%'
  • You have a result ? then this is probably what the user want
  • No ? Go for a much slower LIKE '%pattern%'

Improvement :

  • allowing the user to check an "inside word" modifier that go straight to the second search
  • fixing a minimal number of line for the first query to avoid the 2nd and not just a strict 0 result

Upvotes: 3

Related Questions