Mohammad Javad Noori

Reputation: 1217

Getting true from an equality check when most characters match

How to treat two strings as equal when most of their characters are the same

I want to get true for these samples:

"Hello Wolrld" == "HelloWorld"

OR

"hello world!!" == "helloworld"

I know these strings are not strictly equal, but since most of the characters are the same, that is close enough for me.

Thanks in advance

Upvotes: 2

Views: 72

Answers (4)

SahdevRajput74

Reputation: 752

This returns true if at least 80% of the characters of the longer string also occur somewhere in the other string. Try this:

            String str1 = "Hello world";
            String str2 = "Helloworld!!";

            char[] charArray;

            int per = 0;
            int c = 0;
            if (str1.length() > str2.length()) {

                per = (str1.length() * 80) / 100; // 80% per match logic

                charArray = str1.toCharArray();

                for (int i = 0; i < str1.length(); i++) {

                    String chars = String.valueOf(charArray[i]);

                    if (str2.contains(chars)) {
                        c++;
                    }
                }
            } else {
                per = (str2.length() * 80) / 100; // 80% match threshold, based on str2 (the longer string in this branch)

                charArray = str2.toCharArray();

                for (int i = 0; i < str2.length(); i++) {

                    String chars = String.valueOf(charArray[i]);

                    if (str1.contains(chars)) {
                        c++;
                    }
                }
            }
            if (c >= per) {
                Toast.makeText(getApplicationContext(), "true", Toast.LENGTH_SHORT).show();
            } else {
                Toast.makeText(getApplicationContext(), "false", Toast.LENGTH_SHORT).show();
            }

Upvotes: 0

Kristof U.

Reputation: 1281

You can compute the Levenshtein distance of the two strings (see for example this C# implementation) and then define a threshold up to which you consider the strings to be "equal".

What a reasonable threshold is depends on your requirements. Probably, defining the predicate as d <= a * Math.Min(string1.Length, string2.Length) should work, where d is the Levenshtein distance of the strings and a is a factor of "similarity" between 0 and 1. In your examples a==0.3 should work.
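As a rough illustration of that predicate, here is a minimal sketch in Java (to match the code in the first answer; the same logic ports directly to C#). The class and method names `FuzzyEquals`, `levenshtein` and `roughlyEqual` are made up for this example, and the distance is the textbook dynamic-programming version with unit costs, not any particular library's implementation.

```java
final class FuzzyEquals {

    // Levenshtein distance with unit cost for insert, delete and substitute.
    // Keeps only two rows of the DP table.
    static int levenshtein(String s, String t) {
        int[] prev = new int[t.length() + 1];
        int[] curr = new int[t.length() + 1];
        for (int j = 0; j <= t.length(); j++) {
            prev[j] = j; // turning "" into the first j chars of t takes j inserts
        }
        for (int i = 1; i <= s.length(); i++) {
            curr[0] = i; // turning the first i chars of s into "" takes i deletes
            for (int j = 1; j <= t.length(); j++) {
                int cost = s.charAt(i - 1) == t.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + cost);
            }
            int[] tmp = prev; // swap rows for the next iteration
            prev = curr;
            curr = tmp;
        }
        return prev[t.length()];
    }

    // The predicate from the answer: d <= a * min(len(s), len(t)).
    // 'a' is the similarity factor between 0 and 1 (an assumption you tune).
    static boolean roughlyEqual(String s, String t, double a) {
        int d = levenshtein(s, t);
        return d <= a * Math.min(s.length(), t.length());
    }
}
```

With a = 0.3, `FuzzyEquals.roughlyEqual("Hello Wolrld", "HelloWorld", 0.3)` compares a distance of 2 against a limit of 3, and the second sample compares a distance of 3 against the same limit, so both pass. Because the second sample sits exactly on the limit, a slightly smaller factor would reject it.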

Upvotes: 2

John Wu

Reputation: 52240

If you're looking for a very basic check, you can use Zip to walk both strings in parallel, count the positions where the letters match, and report true if the number of matches is above a certain threshold. Zip stops at the end of the shorter string. This won't detect a match if one string is a shifted version of the other; it only counts letters that are identical at the same index.

using System.Linq; // Zip and Sum are LINQ extension methods

public static class ExtensionMethods
{
    public static bool FuzzyCompare(this string lhs, string rhs, float ratioRequired)
    {
        var matchingLetters =  lhs.Zip
            ( 
                rhs, 
                (a,b) => a == b ? 1 : 0
            )
            .Sum();
        return (float)matchingLetters / (float)lhs.Length > ratioRequired;
    }
}

To check whether more than half of the letters of the first string match, pass a ratioRequired of 0.5. The ratio is taken over lhs.Length, so the result depends on which string you call the method on.

    public static void Main()
    {

        var a = "ABCD";
        var b = "ABCDEFGHI";

        Console.WriteLine( a.FuzzyCompare(b, 0.5F) );
    }

Output:

True

Code on DotNetFiddle

Upvotes: 0

Jophy job

Reputation: 1964

Remove every character that is not a letter or digit, convert the result to lower case, and compare:

Regex.Replace(textBox1.Text, @"[^0-9a-zA-Z]+", "").ToLower() == your string in lower case
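For comparison, the same normalize-then-compare idea can be sketched in Java (matching the first answer's language). The names `NormalizedCompare`, `normalize` and `sameAfterNormalizing` are hypothetical, and the sketch assumes ASCII letters and digits are all that matter, as the pattern above does.

```java
final class NormalizedCompare {

    // Drop every character that is not an ASCII letter or digit, then lowercase.
    static String normalize(String s) {
        return s.replaceAll("[^0-9a-zA-Z]+", "")
                .toLowerCase(java.util.Locale.ROOT);
    }

    // Two strings count as equal if their normalized forms are identical.
    static boolean sameAfterNormalizing(String a, String b) {
        return normalize(a).equals(normalize(b));
    }
}
```

This accepts "hello world!!" against "helloworld", but it still rejects "Hello Wolrld" against "HelloWorld", because the extra letter survives normalization. Catching that case needs a distance-based comparison.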

Upvotes: 2
