mscard02
mscard02

Reputation: 315

MySQL Innodb Full text Proximity Search Gives Horrible Performance

I have a mysql database with a simple table named item. Item contains the following fields and has 55,000 records:

ID (PK) Description (INNODB FULL TEXT) DATE

By design i am forced to index all the way down to 1 character words since some descriptions contain names such as Item 1 a 2 42 where the spaces MUST be kept intact. I am running a full text search against this table and here are my results for the following code:

Select ID, Description, Date
From Item
WHERE date > DATE_SUB(NOW(), INTERVAL 15 MONTH)
AND description LIKE CONCAT('%item 1 2 a 4%')
AND MATCH (description) AGAINST ('+item +1 +2 +a +4' in boolean mode);

This Query returns in 1.2 seconds. However, once i add the following proximity search to the query my return time goes through the roof:

Select ID, Description, Date
From Item
WHERE date > DATE_SUB(NOW(), INTERVAL 15 MONTH)
AND description LIKE ('%item%')
AND MATCH (description) AGAINST ('+item +1 +2 +a +4' in boolean mode)
AND MATCH (description) AGAINST ('"1 2 a 4" @30' in boolean mode);

This Query returns in 54 seconds! The proximity search is a must to my query since i need to make sure i find "item 1 2 a" and not "item 1 2 48884848 222 a" which would be totally different. Proximity search runs a lot better when the words are more than 1 characters, but there are some circumstances that would call for a user typing in 1 character words. Is there ANYTHING else i can do that would be an alternative to proximity searching of full-text innodb but have much better performance? If there is nothing else in MYSQL, i am open to using something to integrate to give me a better proximity search (i am on windows though).

thanks!

Upvotes: 0

Views: 908

Answers (2)

ShaohuaWang
ShaohuaWang

Reputation: 1

Can we use the following rewritten query?

SELECT ID, Description, Date
FROM Item
WHERE date > DATE_SUB(NOW(), INTERVAL 15 MONTH)
AND MATCH (description) AGAINST ('"1 2 a 4" @30 +item' in boolean mode);

Proximity is slow because it's complicated. Suggest you use no more than two words for proximity search. As the word number increases, the query time increases more.

Upvotes: 0

RandomSeed
RandomSeed

Reputation: 29759

The LIKE condition defeats your full-text indexes. Replace the condition

description LIKE CONCAT('%item 1 2 a 4%') -- and why CONCAT() anyways?

... with

MATCH (description) AGAINST ('"item 1 2 a 4"' IN BOOLEAN MODE)

Your second condition is useless as it overlaps with the first condition (you have already filtered by the exact string "item 1 2 a 4", these rows always match '+item +1 +2 +a +4').

Upvotes: 2

Related Questions