michal.jakubeczy

Reputation: 9469

Soundex or Metaphone algorithm for typos in search term

In our search we need to return results that match the search term within a maximum Levenshtein distance of 2. The problem is that we have to run the Levenshtein distance algorithm against every row of a table with millions of rows, which makes the query very slow.

SOUNDEX and Metaphone are great because they produce a hash that can be stored in the database and compared against the hash of the search string. But they are phonetic, not 'typo' based. They work for some cases, but not for all.

I am aware that it does not seem possible to precompute and store a Levenshtein hash, because the distance depends on the search term, which we do not know in advance.

So the question is whether there is any algorithm like SOUNDEX or Metaphone that is 'typo' oriented.

We use a MariaDB database and PHP, so any such algorithm should be feasible to implement in PHP.

Upvotes: 2

Views: 869

Answers (1)

Aurelien M

Reputation: 51

Did you consider applying Levenshtein only to records with a matching Soundex? It should speed up your query considerably.
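A rough sketch of that two-stage idea, written in Python only for illustration (in PHP the built-in soundex() and levenshtein() functions would play the same roles, and the stored code would live in an indexed column). The soundex() below is a simplified American Soundex that I am assuming is close enough for a demo; it is not guaranteed to match the output of MariaDB's SOUNDEX(). The names build_index and search, and the sample data, are made up for the example.

```python
from collections import defaultdict

# Letter -> digit table for a simplified American Soundex (illustrative only).
CODES = {}
for digit, letters in enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1):
    for ch in letters:
        CODES[ch] = str(digit)


def soundex(word):
    """Return a 4-character Soundex code, or '' if the word has no letters."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    out = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = CODES.get(ch, "")
        if code and code != prev:
            out += code
        # 'h' and 'w' do not separate two letters that share a code
        if ch not in "hw":
            prev = code
    return (out + "000")[:4]


def levenshtein(a, b):
    """Classic dynamic-programming edit distance, keeping one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def build_index(names):
    """Group names by Soundex code (stands in for an indexed hash column)."""
    index = defaultdict(list)
    for name in names:
        index[soundex(name)].append(name)
    return index


def search(index, term, max_distance=2):
    """Stage 1: fetch the Soundex bucket. Stage 2: Levenshtein on that bucket only."""
    candidates = index.get(soundex(term), [])
    return [n for n in candidates
            if levenshtein(n.lower(), term.lower()) <= max_distance]


# Hypothetical sample data
index = build_index(["Robert", "Rupert", "Rubin", "Albert", "Roberta"])
print(search(index, "Robrt"))   # typo that keeps the same Soundex code
print(search(index, "Eobert"))  # typo in the first letter
```

The expensive comparison now runs only on one bucket instead of the whole table. The trade-off is recall: a typo that changes the first letter, or swaps a consonant for one from another Soundex group, lands in a different bucket and is never compared, even if its Levenshtein distance is 1.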

Upvotes: 0
