Reputation: 9322
I have a table with about 100,000 rows.
Each row contains a sentence, sentence fragment, or phrase.
I want to write a query that will find all rows that contain all of set of words, even in the words in the criteria are in a different order than the sentence.
For example, if my table looks like this:
id sentence
-- ---------------------------------------------------------------------------
1 How now brown cow
2 Alas, poor Yorick! I knew him
3 Call me Ishmael
4 A screaming comes across the sky
5 It was a bright cold day in April, and the clocks were striking thirteen
6 It was the best of times, it was the worst of times
7 You don't know about me without you have read a book
8 In the late summer of that year we lived in a house in a village
9 One summer afternoon Mrs. Oedipa Maas came home from a Tupperware party
10 It was a queer, sultry summer, the summer they electrocuted the Rosenbergs
My query criteria would be one or more words, in any particular order.
The result set should contain all of the sentences that contain all of the words.
For example, if the criteria is the was
, the results should include rows 5, 6, 10.
Ideally, I'd like to improve this so that the query only needs to include the start of a word. (Note that I want to allow users to only enter the start of a word, but never just the middle or end).
E.g., if the criteria is elect sul
, the result would include row 10.
Currently, here's how I'm doing this:
SELECT
id, sentence
WHERE
(sentence LIKE 'elect%' OR sentence LIKE '% elect%')
AND
(sentence LIKE 'sul%' OR sentence LIKE '% sul%')
This works (I think...) - it finds everything it should. However, it's very slow.
Is there a better way to do this?
For what it's worth - I have flexibility to redesign the table, or create additional "helper" tables.
E.g., I thought about creating a table that contains a row for every unique word and keys to each row of the sentence that includes it.
Also - the query needs to work in MySQL.
Many thanks in advance.
Upvotes: 2
Views: 726
Reputation: 1270463
Your method is fine. If you want to handle multiple words, you can do something like:
select s.id, s.sentence
from sentence s join
(select 'elect' as word union all
select 'sul' as word
) words
on s.sentence like concat(word, '%') or
s.sentence like concat('% ', word, '%')
group by s.id, s.sentence
having count(*) = (select count(*) from words)
This won't be faster (because you have the additional group by
). But it does provide a bit more flexibility.
By the way, have you looked into the full text search capabilities in MySQL?
Upvotes: 2