Boardy
Boardy

Reputation: 36217

MATCH AGAINST SQL Query on MariaDB not working as expected

I am working on a project where I am using MySQL MATCH AGAINST. I've used before in another project without any issue and utilising the same base code, except I'm having some odd behaviour. The only real difference is that I am using MariaDB instead of MySQL.

Below is how my table is defined

CREATE TABLE `temp_logs` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `LogID` int(11) NOT NULL,
  `LogReceived` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `Host` int(11) NOT NULL,
  `LogName` varchar(100) NOT NULL,
  `LogLine` text NOT NULL,
  PRIMARY KEY (`id`),
  KEY `IDXLogID` (`LogID`,`LogReceived`,`Host`),
  FULLTEXT KEY `IDXLogLine` (`LogLine`)
) ENGINE=MyISAM AUTO_INCREMENT=5838772 DEFAULT CHARSET=latin1;

One of the columns that I am doing the full text search against contains the following:

19/06/2019 19:01:18: Took 0 seconds to read lines for log 'Log Agent'

If I do the query as follows (LogLine is the column with the full text search):

SELECT * FROM log_agent.temp_logs WHERE MATCH(LogLine) AGAINST ('+Log' IN BOOLEAN MODE);

But the above query returns no results, even though as shown above the column value contains Log. If I try changing +Log to be +seconds it then returns the row so why does it find seconds but not Log, again if I change +Log for +Agent rows are returned so their doesn't seem to be any rhyme or reason for what its doing.

I've tried removing the IN BOOLEAN MODE as I didn't need this previously but makes no difference.

Upvotes: 3

Views: 730

Answers (1)

Rick James
Rick James

Reputation: 142346

There are 3 caveats in using MyISAM's FULLTEXT:

  • Words that occur in more than half the rows are not indexed.

  • Words shorter than ft_min_word_len are not indexed.

  • Words in the "stop word" list are not indexed.

When filtering on things that FULLTEXT prefers to ignore, this trick is useful:

WHERE MATCH(`LogLine`) AGAINST ('...' IN BOOLEAN MODE)  -- things that FT can do
  AND `LogLine` LIKE "..."  -- (or NOT LIKE or RLIKE or NOT RLIKE, as needed)

This will be reasonably efficient because it will first do the FT test, which will find only a few rows. Then it will go to the extra effort on those rows.

Upvotes: 2

Related Questions