Jürgen Steinblock
Jürgen Steinblock

Reputation: 31743

MySQL Fulltext search: How do I match a string with a minus char?

I have a table

CREATE TABLE `ftx_utf8` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `col1` varchar(45) NOT NULL,
  `col2` text NOT NULL,
  PRIMARY KEY (`id`),
  FULLTEXT KEY `FTX` (`col1`,`col2`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8

with the content

|id|col1   |col2   |
|1 |255-381|       |
|2 |       |255-381|
|3 |255381 |       |
|4 |       |255381 |

Now If I execute the query:

SELECT id, MATCH(col1, col2) AGAINST ('"255381"' IN BOOLEAN MODE) AS match1
FROM ftx_utf8
HAVING match1 > 0

it returns rows 3 and 4

But this query

SELECT id, MATCH(col1, col2) AGAINST ('"255-381"' IN BOOLEAN MODE) AS match1
FROM ftx_utf8
HAVING match1 > 0

does not return rows 1 and 2 as expected. How can I match 255-381 with MySQL Fulltext search?

The odd thing is that a customer claims it worked before. But last week we did an update of his database from an early 5.0 installation to a 5.1.54 server (Windows).

Upvotes: 2

Views: 3429

Answers (3)

mr.frog
mr.frog

Reputation: 436

Just find nice way to do this.

SELECT (
 MATCH(`name`) AGAINST('+$keyword*' IN BOOLEAN MODE) + (`name`=$keyword)*2 
) as `rel`

Upvotes: 0

Jürgen Steinblock
Jürgen Steinblock

Reputation: 31743

I fixes the problem. Or at least I know what is happening:

MySQL ignores certain keywords in his fulltext index. If you have a column containing 255-381, mysql stores 255 and 381 in his full text index as single words. Or to be more precise, by default, these two values are even ignored because the minimum word length, specified by ft_min_word_len=? is 4. But if you set the value to 3 and serch for 255-381 the fulltext search engine searches for the exact phrase "255 381" and returns the row containing 255-381 because 255 followed by 381 is stored in the index.

That works for the example provided in the initial question but is not a perfect solution, because it does not apply for values with 2 chars or even one char followed by a minus. Let's assume I want to search for 25-5381. 25 is skipped because it is shorter than the ft_min_word_len threshold. So a search for 25-5381 is basically the same as a search for 5381 which would also return the rows containing 26-5381 5381 and 5381-12

Since I don't have the time to switch to a better full text search engine like sphynx or lucene and a LIKE query would be to slow, I suppose I will use a workaround:

I will add another column containing all entries from col1 and col2 but replace "-" chars with special string like MINUS and will perform a fulltext query on that column with 255MINUS381 which should do the trick.

Upvotes: 1

Leandro Bardelli
Leandro Bardelli

Reputation: 11598

Sorry but, did u try with REGEX (Regular Expressions) instead? And... why do u use double quotes? Just ' or ", not both. Even more, why you use text and varchar? I think you come from other SQL language, try just MySQL

Upvotes: 0

Related Questions