Reputation: 313
Is it possible to see the actual content of a FULLTEXT INDEX
in MySQL? Or even query the index directly (like SELECT words FROM table.fulltext_index
and it would return all indexed words, just to have a example)?
For a regular b-tree index on a scalar field this is not neccessary, because they are just the same values, just ordered to speed up lookups (basically). But in a fulltext index there's happening so much "language magic" that it's very hard to know what actually lands in the index when doing an INSERT
to the table.
Upvotes: 0
Views: 803
Reputation: 142560
MyISAM has a way, but you are probably not using that version. It includes a way to get the words.
InnoDB builds an inverted index with the word + row number (probably PRIMARY KEY
) + column (if you are indexing multiple columns together) + byte offset into the column. I suspect that last item is stored as a commalist (or similar). That is, I envision the index being something like this:
CREATE TABLE ft_index_for_table_x (
word VARCHAR ...,
pk ..., -- possibly multiple columns
col ..., -- which column the word occurs in
offsets ..., -- list of byte offsets
PRIMARY KEY(word, pk, col)
) ENGINE=InnoDB -- but not really.
Also, note that there are a bunch of transient files; these seem to collect the index info before finally consolidating into the structure above.
I have not heard of a tool for inspecting the index.
This gets something, but I don't think it is quite the list of words:
strings FTS*.ibd |
awk '/^[a-z]+$/' |
sort |
uniq -c |
sort -nr
The last sort gets them with the most frequent first; remove to get them alphabetically.
Upvotes: 1