Thomas Bratt

Reputation: 51982

Best match between two strings when the order or number of times a word appears is not important?

What is the best algorithm to match or compute the distance between two strings in C# when the order or number of times a word appears is not important?

Best means:

Related questions:

Some notes:

Upvotes: 2

Views: 1293

Answers (2)

Marwan

Reputation: 81

Search for a method called "Double Metaphone", which I believe is the best available for word-by-word comparison. It accounts for different languages as well, which is quite amazing.

If you are comparing whole strings, you could combine it with cosine similarity, which should yield very good results.

Upvotes: 1

Vinko Vrsalovic

Reputation: 340456

This looks like a canonical case for applying standard information retrieval algorithms. Cosine distance is what first comes to mind, but there might be better fits for your particular case. This is a good link to start digging along that route:

http://www.miislita.com/information-retrieval-tutorial/cosine-similarity-tutorial.html

Implementation example:

How do I calculate the cosine similarity of two vectors?
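
For the specific case in the question, where neither word order nor repetition matters, here is a minimal sketch rather than a definitive implementation. It reduces each string to its set of distinct words and takes the cosine similarity of the resulting binary vectors, which works out to |A ∩ B| / sqrt(|A| * |B|). The class name WordSetSimilarity, the whitespace tokenization, and the case-insensitive comparison are all assumptions of mine, not something stated in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, named for illustration only.
public static class WordSetSimilarity
{
    // Assumption: words are separated by whitespace and compared case-insensitively.
    // Using a set discards both word order and repeated words.
    private static HashSet<string> Tokenize(string text)
    {
        string[] words = text.ToLowerInvariant()
                             .Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
        return new HashSet<string>(words);
    }

    // Cosine similarity of the binary word vectors: |A ∩ B| / sqrt(|A| * |B|).
    // Returns a value in [0, 1]; 0 if either string contains no words.
    public static double Cosine(string a, string b)
    {
        HashSet<string> setA = Tokenize(a);
        HashSet<string> setB = Tokenize(b);
        if (setA.Count == 0 || setB.Count == 0)
        {
            return 0.0;
        }

        int shared = setA.Count(word => setB.Contains(word));
        return shared / Math.Sqrt((double)setA.Count * setB.Count);
    }
}
```

With this sketch, WordSetSimilarity.Cosine("the quick brown fox", "fox brown quick the the") gives 1, because both strings contain the same four distinct words, while WordSetSimilarity.Cosine("red apple", "green apple") gives 0.5. If near-matching words should also count, the tokens could be replaced by phonetic keys before building the sets, but that step is not shown here.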

Upvotes: 1
