Michael Hedgpeth

Reputation: 7862

How to determine which string in an array is most similar to a given string?

Given a string,

string name = "Michael";

I want to be able to evaluate which string in an array is most similar:

string[] names = new[] { "John", "Adam", "Paul", "Mike", "John-Michael" };

I want to create a message for the user: "We couldn't find 'Michael', but 'John-Michael' is close. Is that what you meant?" How would I make this determination?

Upvotes: 3

Views: 804

Answers (2)

Dr. belisarius

Reputation: 61016

Here are the results for your example using the Levenshtein distance (computed in Mathematica):

EditDistance["Michael",#]&/@{"John","Adam","Paul","Mike","John-Michael"}
{6,6,5,4,5}  

Here are the results using the Smith-Waterman similarity test:

SmithWatermanSimilarity["Michael",#]&/@{"John","Adam","Paul","Mike","John-Michael"}
{0.,0.,0.,2.,7.} 

HTH!

Upvotes: 3

BrokenGlass

Reputation: 160892

This is usually done with the edit distance (Levenshtein distance): count the minimum number of insertions, deletions, or substitutions needed to transform one word into the other, and pick the candidate with the smallest count.
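To make that concrete, here is a minimal C# sketch of the standard dynamic-programming Levenshtein distance, plus a helper that picks the closest candidate. The names `NameMatcher`, `Levenshtein`, and `Closest` are made up for this illustration and are not taken from the linked article. The comparison is case-sensitive, which is an assumption; normalize the inputs first if you want "michael" to match "Michael".

```csharp
using System;
using System.Linq;

// Hypothetical helper class for illustration only.
public static class NameMatcher
{
    // Levenshtein distance: minimum number of single-character insertions,
    // deletions, or substitutions needed to turn a into b (case-sensitive).
    public static int Levenshtein(string a, string b)
    {
        // Only two rows of the DP table are needed at any time.
        int[] previous = new int[b.Length + 1];
        int[] current = new int[b.Length + 1];

        // Turning an empty prefix of a into the first j chars of b takes j insertions.
        for (int j = 0; j <= b.Length; j++)
            previous[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            // Turning the first i chars of a into an empty string takes i deletions.
            current[0] = i;
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                int deletion = previous[j] + 1;
                int insertion = current[j - 1] + 1;
                int substitution = previous[j - 1] + cost;
                current[j] = Math.Min(Math.Min(deletion, insertion), substitution);
            }

            // Swap rows so the row just computed becomes "previous".
            int[] temp = previous;
            previous = current;
            current = temp;
        }

        return previous[b.Length];
    }

    // Returns the candidate with the smallest distance to target.
    // OrderBy is a stable sort, so ties go to the earliest candidate.
    public static string Closest(string target, string[] candidates)
    {
        return candidates.OrderBy(c => Levenshtein(target, c)).First();
    }
}
```

Calling `NameMatcher.Closest("Michael", names)` with the array from the question returns "Mike" (distance 4), not "John-Michael" (distance 5), which agrees with the edit distances posted in the other answer. If you want a candidate that contains the target to rank higher, plain Levenshtein distance may not be the right measure; a local-alignment score such as Smith-Waterman favors that case.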

There's an article providing you with a generic implementation for C# here.

Upvotes: 5
