kari.patila
kari.patila

Reputation: 1079

Workaround for the MySQL 50% threshold for natural search

I have a table where I'm inserting, for instance, images and the names of the colors found in said images. The color string looks something like:

white, yellow, orange, black

Since I have a lot of these, the 50% threshold is starting to drop off some colors, since they appear on most of the rows.

The whole point of the table is to be able to search by color. Does anyone have a workaround, or should I just give up and go for Sphinx or something similar? Recompiling MySQL is probably not an option.

I've heard that some people just add dummy rows to the table to bypass the 50%, but that sounds pretty desperate.

Upvotes: 0

Views: 727

Answers (2)

bobince
bobince

Reputation: 536587

The whole point of the table is to be able to search by color. Does anyone have a workaround

Yes, use BOOLEAN MODE searches, which are not affected by the 50% threshold and are less unpredictable.

But, +1 to tpdi's answer, I can't imagine why you're using fulltext search instead of a simple join table listing the colours. It's a simple yes/no logical storage problem, there's no need to drag in the fulltext complexities of word-splitting, word ordering, stopwords, too-short words (“red” will not be indexed by default) and most of all the fact that you have to use nasty old MyISAM tables to get the feature at all!

Fulltext search is hard in general, and not particularly well-implemented in MySQL. It should be your last resort for when you really do need to search for words in large stretches of text, and not as a feature of choice.

Upvotes: 2

tpdi
tpdi

Reputation: 35171

This is NOT a problem to be solved with full text search.

Instead, you need a child table with one row per color per image.

The color should be a foreign key to a color table, so your child table actually becomes a many-to-many relation between image and color.

create table color ( 
 id int not null primary key auto_increment,
 name varchar64)
); 

create table image_color (
  image_id int references image(id), 
  color_id int reference color(id),
  unique constraint (image_id, color_id) 
) ;

Then index the relation (and give it a unique constraint on the tuple (image_id, color_id).

Then to find all images with a particular color:

select a.* from image a 
join image_color b on (a.id = b.image_id) 
join color c on (b.color_id = c.id)
where c.name = 'yellow';

Upvotes: 6

Related Questions