Reputation: 654
Table:
CREATE TABLE `stores` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`slug` varchar(191) COLLATE utf8mb4_unicode_ci NOT NULL,
`value` json DEFAULT NULL,
`html` mediumtext COLLATE utf8mb4_unicode_ci
GENERATED ALWAYS AS (json_unquote(json_extract(`value`,'$.html')))
STORED,
PRIMARY KEY (`id`),
KEY `slug` (`slug`),
FULLTEXT KEY `html` (`html`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
Query:
select id from `stores` where MATCH(stores.html) AGAINST ('forum*' IN BOOLEAN MODE) limit 20
Takes 0.14 seconds
Explain:
id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | stores | NULL | fulltext | html | html | 0 | const | 1 | 100.00 | Using where; Ft_hints: no_ranking, limit = 20 |
When query:
select id from `stores` where stores.html like '%forum%' limit 20
Takes only 0.003 seconds
Explain:
id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | stores | NULL | ALL | NULL | NULL | NULL | NULL | 134101 | 100.00 | Using where |
I remember, when I first implemented this virtual generated fields over json it appeared to be faster than like, but now after implementing it over all the fields I noticed the site got slower. So I started analyzing the simple queries and found out that fulltext is actually significantly slower!
When I add SQL_NO_CACHE after select it makes no difference.
What am I missing? Thanks
Upvotes: 0
Views: 471
Reputation: 142346
This query is abnormally fast:
select id from `stores`
where stores.html like '%forum%' limit 20
Because it only looked at enough rows to find 20 that had that string. I think you will find that this takes significantly longer since it will check every row:
select id from `stores`
where stores.html like '%non-existent-text%' limit 20
Another possible reason is that MATCH
found hundreds, maybe thousands, or rows before it got to doing the LIMIT
. So time this:
select id from `stores`
where MATCH(stores.html) AGAINST ('qwertyui' IN BOOLEAN MODE) limit 20
The bottom line is that you may need to live with such inconsistencies. I believe (without solid proof your your dataset) than MATCH
will usually be faster than LIKE
. Note the word "usually".
Upvotes: 0