Reputation: 39496
I have a MySQL table in which a column contains string prefixes. For instance, these prefixes could be top-level directories on a Unix file system:
my_table:
+---------+
| prefix |
+---------+
| /usr/ |
| /bin/ |
| /var/ |
| /lib/ |
+---------+
How can I write a query that efficiently finds all rows in this table where the value of the prefix column is the beginning of a given string?
For instance, given the string '/usr/bin/cat', how can I write a query that finds the row containing '/usr/', which is the beginning of '/usr/bin/cat'?
My first guess is to use LIKE this way:
SELECT * FROM my_table
WHERE '/usr/bin/cat' LIKE CONCAT(prefix, '%')
But I'm afraid this query won't be using the index I have on the prefix column.
I also came up with the following:
SELECT * FROM my_table
WHERE prefix <= '/usr/bin/cat' ORDER BY prefix DESC LIMIT 1
This retrieves the prefix equal to or immediately preceding '/usr/bin/cat' in lexicographical order. I can then verify whether '/usr/bin/cat' actually begins with that prefix or not.
But that only works with a single row and I wonder if that's the optimal solution.
Edit: I used root directories as an example but I'd like to know if there's a way to deal with arbitrary strings as well. Perhaps these strings won't contain path separators, or the prefix could be several levels deep, say '/usr/lib'.
Edit: It seems that my second query is bogus. '/usr/' is smaller than '/usr/bin/cat' but so is '/usr/a'. That query is still much faster than a full table scan on a large table but to make it work I have to fetch more rows and go through them until I find the first actual prefix.
So it seems an index can help in this kind of prefix search but I still don't know the best way to take advantage of it.
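To make the "fetch more rows and go through them" idea concrete, here is a minimal Python sketch. It uses a sorted list and `bisect` as a stand-in for the index on prefix; the function name `longest_prefix` is made up for illustration. It assumes stored prefixes are non-empty and that ordering is plain code-point comparison, which may not match the collation MySQL uses on the column.

```python
from bisect import bisect_right


def longest_prefix(sorted_prefixes, target):
    """Return the longest stored value that is a prefix of target, or None.

    sorted_prefixes must be sorted ascending (stand-in for an index scan).
    Assumes no stored prefix is the empty string.
    """
    # Everything at or after position i sorts above target and cannot be
    # one of its prefixes, so start just below it and walk downwards.
    i = bisect_right(sorted_prefixes, target)
    for pos in range(i - 1, -1, -1):
        candidate = sorted_prefixes[pos]
        if target.startswith(candidate):
            # Longer prefixes sort after shorter ones, so the first hit
            # on the way down is the longest match.
            return candidate
        if candidate[:1] != target[:1]:
            # All remaining values are smaller still and start with a
            # smaller character, so none of them can be a prefix.
            break
    return None


prefixes = sorted(['/usr/', '/bin/', '/var/', '/lib/', '/usr/a'])
match = longest_prefix(prefixes, '/usr/bin/cat')
```

In SQL terms this corresponds to running the second query without LIMIT 1 (or with a larger limit) and stopping at the first row that passes the check. The number of rows skipped depends on how many non-matching values sit between the target and its longest prefix.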
Upvotes: 2
Views: 519
Reputation: 300
-- Situation: we do not know where the string can be cut,
-- but we must know the maximum length of a prefix.
-- EDIT: It also helps to know the minimum length of a prefix, to eliminate lots of false positives that we do not want to find (here min = 2 characters).
-- This will definitely use the index. In this example the maximum is 8 characters (x = 8).
-- In your application, generate an SQL query like the following one.
-- There is no full table scan; the index is just used (x - min + 1) times. Hopefully this will be FAST enough! :)
SELECT * FROM my_table WHERE prefix = '/u'
UNION
SELECT * FROM my_table WHERE prefix = '/us'
UNION
SELECT * FROM my_table WHERE prefix = '/usr'
UNION
SELECT * FROM my_table WHERE prefix = '/usr/'
UNION
SELECT * FROM my_table WHERE prefix = '/usr/b'
UNION
SELECT * FROM my_table WHERE prefix = '/usr/bi'
UNION
SELECT * FROM my_table WHERE prefix = '/usr/bin';
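Generating that query in the application could look roughly like the sketch below. The helper names `candidate_prefixes` and `build_prefix_query` are hypothetical, and the sketch swaps the chain of UNIONs for a single `IN (...)` list, which is a different formulation of the same equality lookups. The `%s` placeholder is an assumption about the driver's parameter style; adjust it to whatever your MySQL client library expects.

```python
def candidate_prefixes(target, min_len=2, max_len=8):
    """List every leading substring of target between min_len and max_len."""
    upper = min(max_len, len(target))
    return [target[:n] for n in range(min_len, upper + 1)]


def build_prefix_query(target, min_len=2, max_len=8, placeholder="%s"):
    """Return (sql, params) for an equality lookup on each candidate.

    Returns (None, []) when target is shorter than min_len.
    """
    candidates = candidate_prefixes(target, min_len, max_len)
    if not candidates:
        return None, []
    # One placeholder per candidate; values are passed separately so the
    # driver handles quoting.
    marks = ", ".join([placeholder] * len(candidates))
    sql = f"SELECT * FROM my_table WHERE prefix IN ({marks})"
    return sql, candidates


sql, params = build_prefix_query('/usr/bin/cat')
```

The resulting pair would then be passed to the driver's execute call, for example `cursor.execute(sql, params)`.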
Upvotes: 1
Reputation: 53830
Replace ? with your string.
SELECT *
FROM my_table
WHERE prefix = LEFT(?, LOCATE('/', ?, 2))
You're right in that you want to keep the column on the left side of the expression in order to use the index on your WHERE clause. You can do some manipulation on the string to get the constant to compare to.
Alternatively, can you truncate the string in your application?
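If you go the application route, a minimal Python sketch of the truncation might look like this. The function name `top_level_prefix` is made up for illustration. It mirrors the LEFT/LOCATE expression above, including returning an empty string when there is no second '/'.

```python
def top_level_prefix(path):
    """Cut path just after the first '/' found from the second character on.

    Returns '' when no such separator exists.
    """
    # str.find is 0-based, so starting at index 1 skips the leading '/'.
    idx = path.find('/', 1)
    if idx == -1:
        return ''
    return path[:idx + 1]


constant = top_level_prefix('/usr/bin/cat')
```

The result can then be bound as the single parameter of `SELECT * FROM my_table WHERE prefix = ?`, which keeps the column alone on the left side and a constant on the right.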
Edit
Just one solution of many if you want it to work for any prefix:
SELECT *
FROM my_table
WHERE prefix = LEFT(?, LENGTH(prefix))
However, since the right side of the WHERE clause is not a constant, but a function on the column, MySQL will have to scan every row. It won't use the index on prefix to satisfy the WHERE clause.
Ideally, you want a column on the left side and a constant on the right.
Upvotes: 1