zugabe
zugabe

Reputation: 31

MariaDB FTS skips stopwords in Boolean mode

I am using MariaDB 10.6.19 with a full text search in boolean mode. If I understood the knowledge base correctly, stopwords should not be ignored here and should be found as well.

Simple example I used was a search for "Will Smith". "will" is just being ignored and the search is similar to a search for "Smith" alone...

I am looking for either a explanation of the described behaviour or even better a different method to efficiently search my database for Will Smith ;)

Edit: the same problem persists with both MyISAM and InnoDB

Upvotes: 0

Views: 27

Answers (1)

Georg Richter
Georg Richter

Reputation: 7516

As mentioned in the documentation stop words will be ignored, if the innodb_ft_enable_stopword variable was set to OFF at index creation time:

set session innodb_ft_enable_stopword=ON;
create or replace table t1 (a varchar(255), fulltext(a)) engine=InnoDB;
insert into t1 values ("Will Smith"), ("Hardy Smith"), ("Luke Skywalker"), ("Monty Widenius"), ("Harry Beef"), ("Norman Goodchild");

Now as expected match() function will return a score 0, since exact match fails (will is not stored in fulltext index):

select a from t1 where match(a) against ("+Will +Smith" in boolean mode);
Empty set (0.001 sec)

When disabling stopwords, you will get the expected result:

set session innodb_ft_enable_stopword=OFF;
# recreate index
alter table t1 force;
select a from t1 where match(a) against ("+Will +Smith" in boolean mode);
+------------+
| a          |
+------------+
| Will Smith |
+------------+

Calling the match function without + (which means the word is mandatory in all rows returned) it will return a value > 0 for all rows which contain one or more matching words, regardless if innodb_ft_enable_stopword was enabled or disabled:

select a, match(a) against ("Will Smith" in boolean mode) as ranking from t1;
+------------------+---------------------+
| a                | ranking             |
+------------------+---------------------+
| Will Smith       |  0.8331640362739563 |
| Hardy Smith      | 0.22764469683170319 |
| Luke Skywalker   |                   0 |
| Monty Widenius   |                   0 |
| Harry Beef       |                   0 |
| Norman Goodchild |                   0 |
+------------------+---------------------+

Upvotes: 0

Related Questions