Ramon K.
Ramon K.

Reputation: 3502

MySQL/RDBMS: Is it okay to index long strings? Will it do the job?

Let's suppose I have a table of movies:

+------------+---------------------+------+-----+---------+----------------+
| Field      | Type                | Null | Key | Default | Extra          |
+------------+---------------------+------+-----+---------+----------------+
| id         | bigint(20) unsigned | NO   | PRI | NULL    | auto_increment |
| title      | tinytext            | YES  |     | NULL    |                |
| synopsis   | synopsis            | YES  |     | NULL    |                |
| year       | int(4)              | YES  |     | NULL    |                |
| ISBN       | varchar(13)         | YES  |     | NULL    |                |
| category   | tinytext            | YES  |     | NULL    |                |
| author     | tinytext            | YES  |     | NULL    |                |
| theme      | tinytext            | YES  |     | NULL    |                |
| edition    | int(2)              | YES  |     | NULL    |                |
| search     | text                | YES  |     | NULL    |                |
+------------+---------------------+------+-----+---------+----------------+

In this example, I'm using search column as a summary of the table. So, a possible record would be like the following:

+------------+-------------------------------------------------------------+
| Field      | Value                                                       |
+------------+-------------------------------------------------------------+
| id         | 1                                                           |
| title      | Awesome Book                                                |
| synopsis   | This is a cool book with a cool history                     |
| year       | 2013                                                        |
| ISBN       | 1234567890123                                               |
| category   | Horror                                                      |
| author     | John Doe                                                    |
| theme      | Programmer goes insane                                      |
| edition    | 2nd                                                         |                        
| search     | 2013 horror john doe awesome book this is a cool book (...) |
+------------+---------------------+------+-----+---------+----------------+

This column search will be the one scanned when a search is made. Notice that it has all the words of other fields, in lower case, and possibly some extra words to help on a search.

I have two questions about it:

1) Knowing that this column is a text field and can get really big, is it okay to index it? Will it improve the performance as expected? Why?

2) Despite the index, is it a good idea to use this method to search or is it better to try every column on my query? How can I improve it?

OBS: I don't really have this table, it's just for example purposes. Please ignore any error in datatypes or syntax I may have done.

Upvotes: 0

Views: 1463

Answers (2)

Neville Kuyt
Neville Kuyt

Reputation: 29619

1) Knowing that this column is a text field and can get really big, is it okay to index it? Will it improve the performance as expected? Why?

Yes, you can index it, but no, it won't improve performance. An index on string-type columns only helps when the query matches the start of the column - so in your case, someone searching '2013 horror john' would hit the index, but someone searching 'horror john 2013' would not.

2) Despite the index, is it a good idea to use this method to search or is it better to try every column on my query? How can I improve it?

As Gordon Linoff writes, the best solution is probably full text searching - this is blazingly fast for text searches, deals with "fuzzy" matching, and generally allows you to write a search function similar to the way google works.

Upvotes: 2

Gordon Linoff
Gordon Linoff

Reputation: 1269443

Indexing the search column is not helpful.

What you may want is full text search capabilities on the column, which you can read about here.

Which you use for search depends on whether the searches will be using context. If someone searches for "Clinton", do you want them to restrict the search to authors named "Clinton" or to books about "Clinton"? If you don't care about the context, then full text on one field is quite reasonable.

I need to add: you don't need to put all the search terms in a separate field to use full text search. You can create a full text index on multiple columns. This gives you the flexibility of using full text searches with context (by looking only in specific columns) or without context (by looking in all of them). Your question was about the search column in particular, but that is not the best way to implement the functionality that you are looking for.

Upvotes: 1

Related Questions