Reputation: 1333
I'm using MySQL 8 and I've got a table called records
for which I added the following index in order to perform text search on it:
CREATE FULLTEXT INDEX all_records_idx ON records (`title`, `label`, `description`, `catalog_number`,`barcode`);
Now, I've got the following data (copied it as JSON):
[
{
"barcode": "79027066021",
"title": "White Blood Cells",
"slug": "White-blood-cells",
"image": null,
"description": "rev",
"artist_id": 3,
"genre_id": 1,
"num_of_records": 1,
"original_release_year": 2001,
"original_release_date": null,
"reissue_release_date": null,
"label": null,
"sleeve_side_color": "ff1500",
"vinyl_color": "ff1500",
"size": 12,
"barcode_in_record": null,
"catalog_number": null,
"owned_copies": 1,
"active": 1,
"created_at": "2023-06-11 21:32:17.000",
"updated_at": null,
"deleted_at": null
},
{
"barcode": "889854036119",
"title": "Nilsson Schmilsson",
"slug": "nilsson-schmilsson",
"image": null,
"description": null,
"artist_id": 6,
"genre_id": 1,
"num_of_records": 1,
"original_release_year": 1971,
"original_release_date": null,
"reissue_release_date": null,
"label": null,
"sleeve_side_color": "490d57",
"vinyl_color": "000000",
"size": 12,
"barcode_in_record": null,
"catalog_number": null,
"owned_copies": 1,
"active": 1,
"created_at": "2023-06-11 21:32:17.000",
"updated_at": null,
"deleted_at": null
},
{
"barcode": "C10777774644617",
"title": "Abbey Road",
"slug": "abbey-road",
"image": null,
"description": null,
"artist_id": 2,
"genre_id": 1,
"num_of_records": 1,
"original_release_year": 1969,
"original_release_date": null,
"reissue_release_date": null,
"label": null,
"sleeve_side_color": "46518f",
"vinyl_color": "000000",
"size": 12,
"barcode_in_record": null,
"catalog_number": null,
"owned_copies": 1,
"active": 1,
"created_at": "2023-06-11 21:32:17.000",
"updated_at": null,
"deleted_at": null
},
{
"barcode": "C10777774644618",
"title": "Revolver",
"slug": "revolver",
"image": null,
"description": null,
"artist_id": 2,
"genre_id": 1,
"num_of_records": 1,
"original_release_year": 1966,
"original_release_date": null,
"reissue_release_date": null,
"label": null,
"sleeve_side_color": "46518f",
"vinyl_color": "000000",
"size": 12,
"barcode_in_record": null,
"catalog_number": null,
"owned_copies": 1,
"active": 1,
"created_at": "2023-06-11 21:32:17.000",
"updated_at": null,
"deleted_at": null
},
{
"barcode": "C10777774644619",
"title": "arevolat",
"slug": "aurevolat",
"image": null,
"description": "this is a test",
"artist_id": 2,
"genre_id": 1,
"num_of_records": 1,
"original_release_year": 1999,
"original_release_date": null,
"reissue_release_date": null,
"label": null,
"sleeve_side_color": null,
"vinyl_color": null,
"size": 12,
"barcode_in_record": null,
"catalog_number": null,
"owned_copies": 1,
"active": 1,
"created_at": null,
"updated_at": null,
"deleted_at": null
}
]
So, there are currently 5 records in the table. In order to test the search functionality, I ran the following query:
SELECT * FROM records WHERE MATCH(`title`, `label`, `description`, `catalog_number`,`barcode`)
AGAINST ('*rev*' IN BOOLEAN MODE)
I was expecting to get 3 records back:
But I'm only getting back the first two records:
This one is not present:
Even though I'm using *
as part of the search.
DDL:
CREATE TABLE `records` (
`barcode` varchar(48) NOT NULL,
`title` varchar(160) NOT NULL,
`slug` varchar(250) NOT NULL,
`image` varchar(191) DEFAULT NULL,
`artist_id` bigint unsigned DEFAULT NULL,
`genre_id` bigint unsigned DEFAULT NULL,
`num_of_records` bigint NOT NULL,
`original_release_date` date DEFAULT NULL,
`reissue_release_date` date DEFAULT NULL,
`label` longtext,
`sleeve_side_color` varchar(191) DEFAULT NULL,
`vinyl_color` longtext,
`size` bigint DEFAULT NULL,
`barcode_in_record` tinyint(1) DEFAULT NULL,
`catalog_number` longtext,
`owned_copies` bigint DEFAULT '1',
`active` tinyint(1) DEFAULT '1',
`created_at` datetime(3) DEFAULT NULL,
`updated_at` datetime(3) DEFAULT NULL,
`deleted_at` datetime(3) DEFAULT NULL,
`original_release_year` bigint NOT NULL,
`description` text,
PRIMARY KEY (`barcode`),
UNIQUE KEY `barcode` (`barcode`),
UNIQUE KEY `slug` (`slug`),
KEY `fk_records_artist` (`artist_id`),
KEY `fk_records_genre` (`genre_id`),
FULLTEXT KEY `all_records_idx` (`title`,`label`,`description`,`catalog_number`,`barcode`),
CONSTRAINT `fk_records_artist` FOREIGN KEY (`artist_id`) REFERENCES `artists` (`id`),
CONSTRAINT `fk_records_genre` FOREIGN KEY (`genre_id`) REFERENCES `genres` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
Any idea what's going on?
Thanks
Upvotes: 0
Views: 595
Reputation: 1035
12.9.5 Full-Text Restrictions
The '%' character is not a supported wildcard character for full-text searches.
https://dev.mysql.com/doc/refman/5.7/en/fulltext-restrictions.html
Asterisk *
The asterisk serves as the truncation (or wildcard) operator. Unlike the other operators, it is appended to the word to be affected. Words match if they begin with the word preceding the * operator. [...]
The wildcarded word is considered as a prefix that must be present at the start of one or more words.[...]
The following examples demonstrate some search strings that use boolean full-text operators: [...]
'apple*'
Find rows that contain words such as “apple”, “apples”, “applesauce”, or “applet”.
https://dev.mysql.com/doc/refman/5.7/en/fulltext-boolean.html
Upvotes: 1
Reputation: 1274
In MySQL, the wildcard character is %
rather than *
.
Try AGAINST ('%rev%' IN BOOLEAN MODE)
Sometimes, *
acts as a modifier to match 'zero or more instances of the thing preceding it.'
See https://dev.mysql.com/doc/refman/8.0/en/pattern-matching.html
However, the above Pattern Matching applies to the extended regular expressions (such as when using RLIKE
).
For a Boolean Full-text Search, the *
is a truncation operator. The * at the end of your expression means that rev
must be present at the start of one or more words. This explains the missing record in your results.
I suspect the first *
is ignored.
More info: https://dev.mysql.com/doc/refman/8.0/en/fulltext-boolean.html
Upvotes: 1
Reputation: 142518
"%", "*", and other punctuation marks have specific and different meanings for AGAINST
, LIKE
, and RLIKE
.
MATCH(title, ...) AGAINST('*rev*') -- ignores the leading punctuation mark
title LIKE '%rev%' -- finds the desired 3 rows, but only in `title`; and slow
title RLIKE 'rev' -- finds the desired 3 rows, but only in `title`; and slow
title LIKE '%rev%' OR label LIKE '%rev%' OR ... -- even slower
FULLTEXT, when appropriate, is almost always faster than LIKE
or RLIKE
; often much faster. I suggest you change the requirements and/or expectations. As you see from above, there is no syntax that is both simple and fast.
Upvotes: 2