How to implement Approximate_string_matching (fuzzy string searching) in Java / MySQL?

Question

I am developing a web service in Java using a REST framework.

I am using a MySQL 5.1 database as the backend.

I am performing a search on one of my tables, say Stops, using a LIKE pattern.

Now I want to perform approximate string matching (fuzzy string searching) for this search. For example, for the stop 23 ST, the user might enter 23rd station, 23rd, 23 station, 23rd ST, etc.

I found this link about approximate string matching algorithms: http://en.wikipedia.org/wiki/Approximate_string_matching

But I don't know how to implement it.

Can anyone help me implement an approximate string matching algorithm in Java / MySQL?

Thank you in advance.

npinti · Accepted Answer

One thing you might want to look into would be the Levenshtein Distance Algorithm:

Levenshtein distance is a string metric for measuring the difference between two sequences.

Apache Commons Lang has an implementation of this readily available. You could use StringUtils.getLevenshteinDistance(CharSequence s, CharSequence t, int threshold) to find the strings which are approximately equal to the given string. The threshold comes in handy because the method returns -1 as soon as the distance is known to exceed it, so you can discard words which are more than a certain distance away from your source word and avoid unneeded computation.
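To illustrate the idea without pulling in a library, here is a minimal self-contained sketch of Levenshtein-based filtering. It is not the Commons Lang implementation: the class name FuzzyStops, the method names, and the sample stop names are all made up for this example, and it computes the full distance rather than stopping early at the threshold.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class FuzzyStops {

    // Classic dynamic-programming Levenshtein distance, keeping only two rows.
    public static int distance(String s, String t) {
        int[] prev = new int[t.length() + 1];
        int[] curr = new int[t.length() + 1];
        for (int j = 0; j <= t.length(); j++) {
            prev[j] = j; // "" -> first j chars of t takes j insertions
        }
        for (int i = 1; i <= s.length(); i++) {
            curr[0] = i; // first i chars of s -> "" takes i deletions
            for (int j = 1; j <= t.length(); j++) {
                int cost = s.charAt(i - 1) == t.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1], prev[j]) + 1;
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            int[] tmp = prev; // swap rows for the next iteration
            prev = curr;
            curr = tmp;
        }
        return prev[t.length()];
    }

    // Returns the candidates within `threshold` edits of the query, ignoring case.
    public static List<String> closeMatches(String query, List<String> candidates, int threshold) {
        String q = query.toLowerCase(Locale.ROOT);
        List<String> result = new ArrayList<String>();
        for (String candidate : candidates) {
            if (distance(q, candidate.toLowerCase(Locale.ROOT)) <= threshold) {
                result.add(candidate);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical stop names standing in for rows of the Stops table.
        List<String> stops = Arrays.asList("23 ST", "28 ST", "Canal ST", "Union SQ");
        System.out.println(closeMatches("23rd ST", stops, 2)); // prints [23 ST]
    }
}
```

Note that a longer input such as 23rd station is 7 edits away from 23 ST, so a small threshold on its own will not catch every variant from the question.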

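A better approach would be to compute the Levenshtein distance in MySQL itself. Note that MySQL does not ship a built-in Levenshtein function, so it first has to be added as a user-defined stored function. A simple example of how to execute it can be seen here.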
