Reputation: 2555
I have a database with several fields :
word_id — INTEGER PRIMARY_KEY
word — TEXT
...
And ~150k rows. Since this is a dictionary, I'm searching for a word with mask 'search_string%'
using LIKE
. It used to work, taking 15ms to find matching rows. The table has an index for a field 'word'
.
I modified the table (some fields which are out of the scope) and it's taking 400ms to execute a query, so I understand that as it fails to use index now. Straightforward query with =
instead of LIKE
shows 10ms result. What's happening here?
Upvotes: 34
Views: 12594
Reputation: 382512
GLOB prefix*
workaround
This method does use the index, and so it is a good workaround for LIKE prefix%
that doesn't require modifying the table collation:
SELECT * FROM mytable WHERE mycolumn GLOB 'myprefix*'
From the docs:
The
GLOB
operator is similar toLIKE
but uses the Unix file globbing syntax for its wildcards. Also,GLOB
is case sensitive, unlikeLIKE
.
See also: Can an index on a text column speed up prefix based LIKE queries?
Ask SQLite to suggest an index for us
We can also ask for SQLite to automatically suggest the correct missing index type that we would need for each query by using the options mentioned at SQLite identify missing index. First we setup a test table:
sqlite3 tmp.sqlite 'CREATE TABLE mytable (mycolumn TEXT)'
Then e.g. with .expert
we can check which index would be needed to accelerate a LIKE
prefix with:
sqlite3 tmp.sqlite -cmd .expert \
-cmd "SELECT * FROM mytable WHERE mycolumn LIKE 'myprefix%'" ''
Output:
CREATE INDEX mytable_idx_79a75761 ON mytable(mycolumn COLLATE NOCASE);
SEARCH mytable USING COVERING INDEX mytable_idx_79a75761 (mycolumn>? AND mycolumn<?)
so we see that it recommends a COLLATE NOCASE
index in this case. And for GLOB
:
sqlite3 tmp.sqlite -cmd .expert \
-cmd "SELECT * FROM mytable WHERE mycolumn GLOB 'myprefix*'" ''
gives:
CREATE INDEX mytable_idx_234df46c ON mytable(mycolumn);
SEARCH mytable USING COVERING INDEX mytable_idx_234df46c (mycolumn>? AND mycolumn<?)
so it just recommended a regular index in that case.
Tested on Sqlite 3.40.1, Ubuntu 23.04 on a TEXT
column of a large-ish database I have lying around with 750 M rows. Query results are almost instantaneous.
Related:
Upvotes: 8
Reputation: 49034
An index cannot safely be used in this case. A naive implementation would transform this:
... WHERE word LIKE 'search_string%'
into
... WHERE word >= 'search_string' AND word < 'search_strinh'
by incrementing the last character of the search string. The greater-than and less-than operators can use an index, where LIKE cannot.
Unfortunately, that won't work in the general case. The LIKE
operator is case-insensitive, which means that 'a' LIKE 'A'
is true. The above transformation would break any search string with capitalized letters.
In some cases, however, you know that case sensitivity is irrelevant for a particular column, and the above transformation is safe. In this case, you have two options.
NOCASE
collating sequence on the index that covers this particular field.LIKE
operator program-wide by running PRAGMA case_sensitive_like = ON;
Either of these behaviors will enable SQLite to transparently do the above transformation for you; you just keep using LIKE
as always, and SQLite will rewrite the underlying query to use the index.
You can read more about "The LIKE Optimization" on the SQLite Query Optimizer Overview page.
Upvotes: 42