Jerry
Jerry

Reputation: 1238

How to do fulltext search in multiple columns in MySQL, quickly?

I know this question has been asked several times.. but , let me explain.

I have a table with 450k records of users (id, first name, last name, address, phone number, etc ..). I want to search users by thei first name and/or their last name.

I used these queries :

SELECT * FROM correspondants WHERE nom LIKE 'Renault%' AND prénom LIKE 'r%';

and

SELECT * FROM correspondants WHERE CONCAT(nom, CHAR(32), prénom= LIKE 'Renault r%';

It works well, but with a too high duration (1,5 s). This is my problem.

To fix it, I tried with MATCH and AGAINST with a full text index on both colums 'nom' and 'prénom' :

SELECT * FROM correspondants WHERE MATCH(nom, prénom) AGAINST('Renault r');

It's very quick (0,000 s ..) but result is bad, I don't obtain what I should have.

For example, with LIKE function, results are :

88623   RENAULT Rémy
91736   RENAULT Robin
202269  RENAULT Régine

(3 results).

And with MATCH/AGAINST :

327380  RENAULT Luc
1559    RENAULT Marina
17280   RENAULT Anne
(...)
88623   RENAULT Rémy
91736   RENAULT Robin
202269  RENAULT Régine
(...)
436696  SEZNEC-RENAULT  Helene
(...)

(115 results !)

What is the best way to do a quick and efficient text search on both columns with a "AND" search ? (and what about indexes)

Upvotes: 3

Views: 2380

Answers (2)

Bill Karwin
Bill Karwin

Reputation: 562701

Fulltext search doesn't do pattern-matching as LIKE string comparisons do. Fulltext search only searches for full words, not fragments like r%.

Also there's a minimum size of word, controlled by the ft_min_word_len configuration variable. To avoid making the fulltext index too large, it doesn't index words smaller than that variable. And therefore short words are ignored when you search, so r is ignored.

There's also no choice in fulltext indexing to search for words in a specific position like at the beginning of a string. So your search for renault may be found in the middle of the string.

To solve these issues, you could do the following:

SELECT * FROM correspondants WHERE MATCH(nom, prénom) AGAINST('Renault')
  AND CONCAT(nom, CHAR(32), prénom) LIKE 'Renault r%';

This would use the fulltext index to find a small subset of your 450,000 rows that have the word renault somewhere in the string. Then the second term in the search would be done without help from an index, but only against the subset of rows that match the first term.

Upvotes: 2

Rick James
Rick James

Reputation: 142433

That particular query is best done this way:

INDEX(nom, prénom)

WHERE non = 'Relault' AND prénom LIKE 'R%'

I recommend that you add that index and add code to your application to handle different requests in different ways.

Do not hide an indexed column inside a function call, such as CONCAT(nom, ...), it will not be able to use the index; instead it will check every row, performing the CONCAT for every row and then doing the LIKE. Very slow.

Except for cases of initials (as above), you should mostly avoid very short names. However, here is another case where you can make it work with extra code:

WHERE nom = 'Lu'

(with the same index). Note that using any flavor of MATCH is likely to be much less efficient.

So, if you are given a full last name, use WHERE nom =. If you are given a prefix, then it might work to use WHERE nom LIKE 'Prefix%' Etc.

FULLTEXT is best used for cases where you have full words scattered in longer text, which is not your case since you have nom and prénom split out.

Perhaps you should not use MATCH for anything in this schema.

Upvotes: 0

Related Questions