Reputation: 4762
There are certain characters (operators) that affects behavior of full-text search in MariaDB. They are +-<>()~*"
and their functionality is described in documentation.
I want to be able to search for a word which contains one of these operators and I want MariaDB to deal with it as a normal character not an operator. How can I do it?
Example:
Let's create table with full-text index:
CREATE TABLE users (username TEXT, FULLTEXT(username)) ENGINE=InnoDB;
INSERT INTO users(username) VALUES ('joseph'), ('jose'), ('jose*');
Now I want to search for rows which contains exactly jose*
:
SELECT * FROM users WHERE MATCH(username) AGAINST('jose*' IN BOOLEAN MODE);
+----------+
| username |
+----------+
| joseph |
| jose |
| jose* |
+----------+
But I want only row with jose*
. The same result is when I try to escape that string the way I would expect it could work.
SELECT * FROM users WHERE MATCH(username) AGAINST('jose\*' IN BOOLEAN MODE);
+----------+
| username |
+----------+
| joseph |
| jose |
| jose* |
+----------+
SELECT * FROM users WHERE MATCH(username) AGAINST('jose\\*' IN BOOLEAN MODE);
+----------+
| username |
+----------+
| joseph |
| jose |
| jose* |
+----------+
What is the proper way to escape string for full-text search in MariaDB/MySQL?
Upvotes: 3
Views: 2993
Reputation: 11106
A fulltext search is a tool to efficiently search for words (or the beginning of words) that appear anywhere in the (full) text. If your data does not contain separated "words" (in whatever way you want to define them), a fulltext index is not the right tool for your task (as it will be completely useless). By default, *
is a word delimiter just like space (so e.g. 'abc*def'
, as well as 'abc def'
, are two words, with two seperate entries in the fulltext index, none of which will contain the *
). You can specify what you want to be a delimiter, but MySQL does not support to specify it on the fly by escaping them in the search expression; you need to do it when you create the index, so the index will actually contain jose*
, and not just jose
.
If you do not have words (or a very limited set of delimiters), you can use e.g. username = 'jose*
, username like 'jose*'
or similar; alternatively, you could use regular expressions, which are slow, but the fallback tool for complicated requirements (e.g. if a fulltext does not work for your situation) where a fulltext index is not usabel (and/or you cannot change the configuration to fit it to your requirements).
To change what characters MySQL treats as a delimiter, you can alter the character map, see Adding a Collation for Full-Text Indexing:
index.xml
latin1.xml
), and edit the ctype
to define a specific character as a (non-)delimiter; for only *
, change it to "48 10 10 10 10 10 10 10 10 10 01 10 10 10 10 10"); do that for all characters you want to be searchable (but again, keep in mind that, if you do not have at least one remaining delimiter, a fulltext search is useless).... (username TEXT collate 'latin1_fulltext_ci', ...
) and recreate the fulltext index, and MySQL will include those characters into the index.Now the following three searches should return the expected results:
... MATCH(username) AGAINST('"jose*"' IN BOOLEAN MODE);
... MATCH(username) AGAINST('jose*');
... MATCH(username) AGAINST('"jose*"');
"..."
will look for an exact match (e.g. a word combination); it works similar to escaping, but not quite, as it will only apply to non-delimiter characters.
... MATCH(username) AGAINST('jose*' IN BOOLEAN MODE);
will not work for InnoDB (it will be treated as a wildcard), but will work for MyISAM (one of some subtle differences between those).
If you actually want to utilize the boolean mode, but need a wildcard other than *
, you could define a different wildcard symbol using ft_boolean_syntax
, although due to a bug in InnoDB this also only works in MyISAM. It's also a global setting, so would change the behaviour of all other fulltext searches in other tables (and databases). You may have to specify what you want to achieve with this mode to see if there is a way to make the fulltext search work with these requirements, but ultimately, you may have to fall back to using like
.
Upvotes: 3