Nyfiken Gul
Nyfiken Gul

Reputation: 684

String-matching first n letters of two strings

So for a problem I'm facing I would like to know how long a sequence (starting from index 0) two strings are 'the same' - I think it'd be clearer to just give an example;

Is there any more (time-)efficient way to go about this than to just iterate over the two words? Could I make use of some built-in method of some sort? (For my task I want to avoid importing any custom libs)

Upvotes: 1

Views: 2117

Answers (4)

Sachin Chauhan
Sachin Chauhan

Reputation: 366

I think the fastest approach would be to use Binaray Search, which will give you O(logn) complexity instead of O(n). Here n is the length of smallest string.

The approach is simple in binary search. Look for similarity end for the index character in both the strings. For example if i is your index then check i+1 for dis-similarity character where character at i index is similar. And if that is the case break, return i as your answer. Or else keep on searching in sub-scope.

Edit

Adding function for better understanding.

int lengthOfFirstSimilarCharacters(String str1, String str2) {
    int strlen1 = str1.length();
    int strlen2 = str2.length();
    if(strlen1 > strlen2){
        return lengthOfFirstSimilarCharacters(str2,str1);
    }
    int i = 0;
    int j = strlen1-1;
    while(i<=j){
        int mid = i + (j-i)/2;
        if(str1.charAt(mid) == str2.charAt(mid)) {
            if(mid+1<strlen1 && str1.charAt(mid+1) != str2.charAt(mid+1)){
                return mid+1;
            }
            i = mid+1;
        }else{
            j = mid-1;
        }
    }
    return i;
}

Upvotes: 4

WillD
WillD

Reputation: 875

Using Streams

    String s1 = "Yellow";
    String s2 = "Yelling";
    int limit = (s1.length() > s2.length() ? s2.length() : s1.length()) - 1;
    int ret = IntStream.range(0, limit)
                .filter(i -> s1.charAt(i) != s2.charAt(i))
                .findFirst().orElse(-1);
    //-1 if the Strings are the same.

Upvotes: 1

st2rseeker
st2rseeker

Reputation: 483

Correction:

The answer by Sachin Chauhan is indeed correct and better at runtime (i.e. using binary search to search for the first difference).

I will leave my answer to allow for a simpler solution programmer-time, for the cases where the length is of no great influence (i.e. relatively short strings), but a simple solution would be preferable.

Here is the original answer:

As it's a simple loop, I doubt any inbuilt method will be much of a "programmer"-time improvement (and definitely not much of run-time improvement worth to mention).

For the record, I know of no such Java method (perhaps some external library, but you've stated you'd prefer to avoid them).

Reference code would be something along these lines, I'd imagine:

public int longestCommonPrefixLength(String s1, String s2) {

    if (s1 == null || s1.length() == 0 || s2 == null || s2.length() == 0) {
        return 0;
    }

    int commonPrefixLength = 0;

    for (int i = 0; i < Math.min(s1.length(), s2.length()); i++) {
        if (s1.charAt(i) == s2.charAt(i)) {
            commonPrefixLength++;
        } else {
            break;
        }
    }

    return commonPrefixLength;
}

As we see, with all the verbosity of Java and my "clarity" style, it's still just 18 lines of code. :)

Relaxing some clarity, you can even shorten the for to:

for (int i = 0; i < Math.min(s1.length(), s2.length()) && s1.charAt(i) == s2.charAt(i); i++, commonPrefixLength++);

for 6 lines less.

To take it to the (correct) extreme:

public int longestCommonPrefixLength2(String s1, String s2) {
    if (s1 == null || s1.length() == 0 || s2 == null || s2.length() == 0) return 0;
    int i = 0;
    for (; i < Math.min(s1.length(), s2.length()) && s1.charAt(i) == s2.charAt(i); i++);
    return i;
}

6 LOC :)

Something curious, by the way:

String class has boolean regionMatches(int toffset, String other, int ooffset, int len) method (which does internally pretty much the above up to a given len) - you could also iteratively increase len until it no longer returns true, but that would not be anywhere near same efficiency, of course.

Upvotes: 1

ABHIJITH GM
ABHIJITH GM

Reputation: 134

You dont have to iterate through both texts. Iterate through the smaller one and compare character at same index. break as and when you find a mismatch

String a ="Yellow";
String b= "Yelling";
String smaller = (a.length < b.length) ? a:b;
int ret =0;
for (index based on smaller ){
  compare character using charAt and if matching ret++, else break;
}
return ret;

//use charAt along with equalsIgnoreCase ifu want it to be case insensitive. String.valueOf(a.charAt(index)).equalsIgnoreCase(String.valueOf(b.charAt(index)))

Upvotes: 1

Related Questions