Romain Linsolas

Reputation: 81627

How to simply check for the proximity of two words?

I want to check that two words are very close to each other. My need is simple: we let the user trigger an action by replying to an email, and the reply should be a single word (APPROVED, REFUSED, etc.). The list of possible actions is short. I now have to parse this reply, but the comparison has to be "typo-safe": if the user types aproved or apporved, for example, it should still be accepted.

Of course, I could build my own list of almost-correct words (["Approved", "Aproved", "Apporved", ...]) and compare the user input with each element of this array, but enumerating every possible typo is tedious.

I know that I can do this with Lucene, but that seems like overkill for my needs. Ideally I would like a method such as WordUtils.proximity("Approved", userInput). A phonetic comparison is not required in my case.

Is there a small library that can do that?

Upvotes: 4

Views: 966

Answers (1)

Aviram Segal

Reputation: 11120

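You can use the Levenshtein distance between the two strings to measure how close they are: it is the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other.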

There are other string distance algorithms, but I have used this one before and it worked for me.

Here is an implementation you can try: Algorithm Implementation/Strings/Levenshtein distance

Alternatively, you can use StringUtils#getLevenshteinDistance() from Apache Commons-Lang.
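If you would rather not add a dependency, a minimal sketch of the idea might look like the following. The class name TypoSafeMatcher, the keyword list, and the tolerance of 2 edits are all assumptions made for illustration; they do not come from any library. Note that plain Levenshtein counts a swap of two adjacent letters (apporved) as 2 edits, which is why the tolerance here is 2 rather than 1.

```java
import java.util.List;
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper for illustration; not part of any library.
final class TypoSafeMatcher {

    // Assumed keyword list and tolerance; tune both for your own use case.
    private static final List<String> KEYWORDS = List.of("APPROVED", "REFUSED");
    private static final int MAX_DISTANCE = 2;

    // Classic dynamic-programming Levenshtein distance, keeping only two rows.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a: j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b: i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1), // insert, delete
                        prev[j - 1] + cost);                    // substitute or match
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Returns the closest keyword within MAX_DISTANCE edits, or empty if none.
    static Optional<String> match(String userInput) {
        String normalized = userInput.trim().toUpperCase(Locale.ROOT);
        String best = null;
        int bestDistance = MAX_DISTANCE + 1;
        for (String keyword : KEYWORDS) {
            int distance = levenshtein(normalized, keyword);
            if (distance < bestDistance) {
                bestDistance = distance;
                best = keyword;
            }
        }
        return Optional.ofNullable(best);
    }
}
```

With this sketch, TypoSafeMatcher.match("aproved") and TypoSafeMatcher.match("apporved") both resolve to APPROVED, while an unrelated word such as "hello" yields an empty result. A fixed tolerance only works well when the keywords are long and clearly distinct from each other, as they are here; for short keywords you would probably want a smaller tolerance.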

Upvotes: 5
