Reputation: 1217
How to recognize two strings when most characters are similar
I want to get true for samples like these:
"Hello Wolrld" == "HelloWorld"
OR
"hello world!!" == "helloworld"
I know these strings are not equal, but since most of the characters are the same, that is close enough for me.
Thanks in advance
Upvotes: 2
Views: 72
Reputation: 752
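This returns true if at least 80% of the characters match. It only checks whether each character occurs somewhere in the other string, so position is ignored. Try this: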
String str1 = "Hello world";
String str2 = "Helloworld!!";
char[] charArray;
int per = 0; // minimum number of matching characters required
int c = 0;   // number of characters found in the other string
if (str1.length() > str2.length()) {
    per = (str1.length() * 80) / 100; // 80% of the longer string
    charArray = str1.toCharArray();
    for (int i = 0; i < str1.length(); i++) {
        String chars = String.valueOf(charArray[i]);
        if (str2.contains(chars)) {
            c++;
        }
    }
} else {
    per = (str2.length() * 80) / 100; // 80% of the longer (or equal-length) string
    charArray = str2.toCharArray();
    for (int i = 0; i < str2.length(); i++) {
        String chars = String.valueOf(charArray[i]);
        if (str1.contains(chars)) {
            c++;
        }
    }
}
if (c >= per) {
    Toast.makeText(getApplicationContext(), "true", Toast.LENGTH_SHORT).show();
} else {
    Toast.makeText(getApplicationContext(), "false", Toast.LENGTH_SHORT).show();
}
Upvotes: 0
Reputation: 1281
You can compute the Levenshtein distance of the two strings (see for example this C# implementation) and then define a threshold up to which you consider the strings to be "equal".
What a reasonable threshold is depends on your requirements. Probably, defining the predicate as d <= a * Math.Min(string1.Length, string2.Length) should work, where d is the Levenshtein distance of the strings and a is a factor of "similarity" between 0 and 1. In your examples a = 0.3 should work.
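As a rough illustration of the predicate above, here is a minimal sketch. The class name FuzzyMatch and the method names Levenshtein and IsSimilar are made up for this example, and the distance routine is a plain two-row dynamic-programming version rather than the linked implementation.

```csharp
using System;

public static class FuzzyMatch
{
    // Levenshtein distance: minimum number of single-character
    // insertions, deletions and substitutions turning s into t.
    public static int Levenshtein(string s, string t)
    {
        var prev = new int[t.Length + 1];
        var curr = new int[t.Length + 1];

        // Distance from the empty prefix of s to each prefix of t.
        for (int j = 0; j <= t.Length; j++) prev[j] = j;

        for (int i = 1; i <= s.Length; i++)
        {
            curr[0] = i; // delete all i characters of s
            for (int j = 1; j <= t.Length; j++)
            {
                int cost = s[i - 1] == t[j - 1] ? 0 : 1;
                curr[j] = Math.Min(
                    Math.Min(curr[j - 1] + 1, prev[j] + 1), // insertion, deletion
                    prev[j - 1] + cost);                    // substitution or match
            }
            (prev, curr) = (curr, prev); // reuse the two rows
        }
        return prev[t.Length];
    }

    // Hypothetical helper: true when d <= a * min(length), as in the predicate above.
    public static bool IsSimilar(string s, string t, double a)
    {
        int d = Levenshtein(s, t);
        return d <= a * Math.Min(s.Length, t.Length);
    }
}
```

With a = 0.3, FuzzyMatch.IsSimilar("Hello Wolrld", "HelloWorld", 0.3) and FuzzyMatch.IsSimilar("hello world!!", "helloworld", 0.3) both return true: the distances are 2 and 3, and the threshold is 0.3 * 10 = 3 in both cases. The comparison is case-sensitive, so lowercase both strings first if case should not count.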
Upvotes: 2
Reputation: 52240
If you're looking for a very basic check, you can enumerate over the characters using Zip to compare them, count the matching letters, and report true if the number of matches is above a certain threshold. This won't capture it if one string is a shifted version of the other; it'll only catch letters in common at the same index.
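using System.Linq;

public static class ExtensionMethods
{
    public static bool FuzzyCompare(this string lhs, string rhs, float ratioRequired)
    {
        // Zip pairs characters by index and stops at the end of the shorter string.
        var matchingLetters = lhs.Zip
        (
            rhs,
            (a, b) => a == b ? 1 : 0
        )
        .Sum();
        // The ratio is measured against the length of lhs only.
        return (float)matchingLetters / (float)lhs.Length > ratioRequired;
    }
}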
To check whether two strings match on more than half of the letters (measured against the length of the first string), pass a ratioRequired of 0.5.
public static void Main()
{
    var a = "ABCD";
    var b = "ABCDEFGHI";
    Console.WriteLine( a.FuzzyCompare(b, 0.5F) );
}
Output:
True
Upvotes: 0
Reputation: 1964
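Strip everything except letters and digits, convert the result to lower case, and compare it with your string in lower case:
Regex.Replace(textBox1.Text, @"[^0-9a-zA-Z]+", "").ToLower() == yourString.ToLower()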
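A small sketch of the same idea applied to both sides of the comparison. The class name Normalizer and the method names Normalize and EqualsIgnoringNoise are invented for this example, and using ToLowerInvariant instead of ToLower is an assumption made here to avoid culture-specific casing.

```csharp
using System;
using System.Text.RegularExpressions;

public static class Normalizer
{
    // Remove every character that is not an ASCII letter or digit, then lowercase.
    public static string Normalize(string input) =>
        Regex.Replace(input, @"[^0-9a-zA-Z]+", "").ToLowerInvariant();

    // Hypothetical helper: compare two strings after normalizing both.
    public static bool EqualsIgnoringNoise(string a, string b) =>
        Normalize(a) == Normalize(b);
}
```

This covers differences in spacing, punctuation and case, so "hello world!!" matches "helloworld". It does not cover transposed or wrong letters: "Hello Wolrld" normalizes to "hellowolrld", which is still different from "helloworld".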
Upvotes: 2