Mallok
Mallok

Reputation: 23

Searching similar word on MySql

I've a problem, I need to search into MyISAM FullText field.

What I need to search?

ie. "avrir"
ie. "dividdos"
ie. "petacular"

What I have in my field's?

ie. "abrir"
ie. "divididos"
ie. "espectacular"

As you can see in the examples a user can put a wrong word in the search and I need to be able to select the best matching word.

Does anyone have an idea how can I do this?

I read about "sphinx" but I didn't find a solution for this question with.

Upvotes: 2

Views: 2496

Answers (3)

Johan
Johan

Reputation: 76753

You can download a levenshtein udf
For info about levenshtein see: http://en.wikipedia.org/wiki/Levenshtein_distance

At http://www.joshdrew.com/

Download location:
http://www.joshdrew.com/mysql_levenshtein_udf-1.0.tar.gz

This will allow you to install an UDF that will add a levenshtein function in MySQL that you can use in your queries. You can compile the UDF on linux using the instruction in the .gz file.

Here's how to install the UDF after you've compiled it.

>mv mysqllevenshtein.so /usr/lib

Then launch mysql

>mysql -uroot -pPASS

in mysql

mysql> use DATABASE

Database changed

mysql> CREATE FUNCTION levenshtein RETURNS INT SONAME 'mysqllevenshtein.so';

select levenshtein(word1,word2) as dist 
from book where ETC........... 
order by dist asc 
limit 0,10;

Upvotes: 1

Jeremy S.
Jeremy S.

Reputation: 6653

I am not sure if you can achieve what you want, with that what @Johan suggested. Anyways.

If you want to use sphinx the function you are searching for is called

stemming

see the Sphinx Doc for more details

http://sphinxsearch.com/docs/current.html

The following is stated in the documentation

  • Sphinx supports stemming (stemmers for English, Russian and Czech are built-in; and stemmers for French, Spanish, Portuguese, Italian, Romanian, German, Dutch, Swedish, Norwegian, Danish, Finnish, Hungarian, are available by building third party libstemmer library);

We currently use Sphinx at work and are really happy with it.

Takes some time to get used to it and understand it but once you got it its hell a lot faster that mysql full-text search.

Regards

Upvotes: 0

James C
James C

Reputation: 14169

MySQL's SOUNDEX() function might help you but it won't work magically with a blob of text in a FULLTEXT index.

Levenshtein distance might be something worth looking at too.

Upvotes: 0

Related Questions