Get Off My Lawn
Get Off My Lawn

Reputation: 36351

MySQL fulltext: have a result weigh more

I have a database of cities and states (about 43,000). I do a full text search on it like so:

select city, state, match(city, state_short, state) against (:q in boolean mode) as score
from zipcodes where
match(city, state_short, state) against (:q in boolean mode)
group by city, state order by score desc limit 6

When I replace :q with a meaningful string, it works, but lets say I search for houston texas, I would like the result to be first, but instead it is 3rd:

How can I make Houston, Texas weigh more than the other 2? This obviously should be the same for other cities like this as well.

Edit

This works, any thoughts on it?

SELECT * FROM (
    SELECT city, state, MATCH(city, state_short, state) AGAINST (:q IN BOOLEAN MODE) as score
    FROM zipcodes
    WHERE MATCH(city, state_short, state) AGAINST (:q IN BOOLEAN MODE)
    GROUP BY city, state
    ORDER BY score DESC LIMIT 6
) AS tbl
ORDER BY score DESC, LENGTH(city)

Upvotes: 0

Views: 124

Answers (1)

G-Nugget
G-Nugget

Reputation: 8846

Your new query may work, but it is entirely circumstantial. Instead of doing ORDER BY LENGTH(city), something like ORDER BY ABS(LENGTH(:q) - (LENGTH(city) + LENGTH(state))) would be better. This isn't perfect, but it should be better since anything having the same length as the input and a high score is probably what you're looking for. The final query would look something like this:

SELECT city, state, MATCH(city, state_short, state) AGAINST (:q IN BOOLEAN MODE) AS score
FROM zipcodes
WHERE MATCH(city, state_short, state) AGAINST (:q IN BOOLEAN MODE)
GROUP BY city, state
ORDER BY score DESC, ABS(LENGTH(:q) - (LENGTH(city) + LENGTH(state))) DESC LIMIT 6

I moved the new ORDER BY clause into the main query to remove the subquery. This should yield the same (or possibly more accurate) results.

The Levenshtein Distance would probably be a more accurate measure for this, but there is no native implementation of it in MySQL. This post has more information about a MySQL implementation of a Levenshtein Distance function.

Upvotes: 1

Related Questions