Reputation: 36351
I have a database of cities and states (about 43,000). I do a full text search on it like so:
select city, state, match(city, state_short, state) against (:q in boolean mode) as score
from zipcodes where
match(city, state_short, state) against (:q in boolean mode)
group by city, state order by score desc limit 6
When I replace :q
with a meaningful string, it works, but lets say I search for houston texas
, I would like the result to be first, but instead it is 3rd:
North Houston, Texas
South Houston, Texas
Houston, Texas
How can I make Houston, Texas
weigh more than the other 2? This obviously should be the same for other cities like this as well.
Edit
This works, any thoughts on it?
SELECT * FROM (
SELECT city, state, MATCH(city, state_short, state) AGAINST (:q IN BOOLEAN MODE) as score
FROM zipcodes
WHERE MATCH(city, state_short, state) AGAINST (:q IN BOOLEAN MODE)
GROUP BY city, state
ORDER BY score DESC LIMIT 6
) AS tbl
ORDER BY score DESC, LENGTH(city)
Upvotes: 0
Views: 124
Reputation: 8846
Your new query may work, but it is entirely circumstantial. Instead of doing ORDER BY LENGTH(city)
, something like ORDER BY ABS(LENGTH(:q) - (LENGTH(city) + LENGTH(state)))
would be better. This isn't perfect, but it should be better since anything having the same length as the input and a high score is probably what you're looking for. The final query would look something like this:
SELECT city, state, MATCH(city, state_short, state) AGAINST (:q IN BOOLEAN MODE) AS score
FROM zipcodes
WHERE MATCH(city, state_short, state) AGAINST (:q IN BOOLEAN MODE)
GROUP BY city, state
ORDER BY score DESC, ABS(LENGTH(:q) - (LENGTH(city) + LENGTH(state))) DESC LIMIT 6
I moved the new ORDER BY
clause into the main query to remove the subquery. This should yield the same (or possibly more accurate) results.
The Levenshtein Distance would probably be a more accurate measure for this, but there is no native implementation of it in MySQL. This post has more information about a MySQL implementation of a Levenshtein Distance function.
Upvotes: 1