dark_shadow
dark_shadow

Reputation: 3573

How to find the longest substring with no repeated characters?

I want an algorithm to find the longest substring of characters in a given string containing no repeating characters. I can think of an O(n*n) algorithm which considers all the substrings of a given string and calculates the number of non-repeating characters. For example, consider the string "AABGAKG" in which the longest substring of unique characters is 5 characters long which corresponds to BGAKG.

Can anyone suggest a better way to do it ?

Thanks

Edit: I think I'm not able to explain my question properly to others. You can have repeating characters in a substring (It's not that we need all distinct characters in a substring which geeksforgeeks solution does). The thing which I have to find is maximum no of non-repeating characters in any substring (it may be a case that some characters are repeated).

for eg, say string is AABGAKGIMN then BGAKGIMN is the solution.

Upvotes: 4

Views: 5954

Answers (9)

AFetter
AFetter

Reputation: 3584

Let me contribute a little as well. I have this solution with complexity will be O(N). The algorithm’s space complexity will be O(K), where K is the number of distinct characters in the input string.

public static int NoRepeatSubstring(string str)
    {
        int start = 0;
        int maxLen = 0;
        Dictionary<char, int> dic = new Dictionary<char, int>();
        for (int i = 0; i < str.Length; i++)
        {
            char rightChar = str[i];
            // if the map already contains the 'rightChar', shrink the window from the beginning so that
            // we have only one occurrence of 'rightChar'
            if (dic.ContainsKey(rightChar))
            {
                // this is tricky; in the current window, we will not have any 'rightChar' after its previous index
                // and if 'start' is already ahead of the last index of 'rightChar', we'll keep 'windowStart'
                start = Math.Max(start, dic[rightChar] + 1);
            }

            if (dic.ContainsKey(str[i]))
                dic[str[i]] = i;
            else
                dic.Add(str[i], i);

            maxLen = Math.Max(maxLen, i - start + 1);

        }
        return maxLen;
    }

And here some Unit Tests:

Assert.Equal(3, SlideWindow.NoRepeatSubstring("aabccbb"));
Assert.Equal(2, SlideWindow.NoRepeatSubstring("abbbb"));
Assert.Equal(3, SlideWindow.NoRepeatSubstring("abccde"));

Upvotes: 0

Akanksha Gupta
Akanksha Gupta

Reputation: 1

  //Given a string ,find the longest sub-string with all distinct characters in it.If there are multiple such strings,print them all.
  #include<iostream>
  #include<cstring>
  #include<array>

  using namespace std;

 //for a string with all small letters
 //for capital letters use 65 instead of 97

 int main()
 {

  array<int ,26> count ;

  array<string,26>largest;

  for(int i = 0 ;i <26;i++)
  count[i]=0;

  string s = "abcdefghijrrstqrstuvwxyzprr";
  string out = "";

  int k = 0,max=0;

  for(int i = 0 ; i < s.size() ; i++)
     { 
         if(count[s[i] - 97]==1)
           {  
              int loc = out.find(s[i]);

              for(int j=0;j<=loc;j++)   count[out[j] - 97]=0;

              if(out.size() > max) 
                {   
                   max = out.size();
                   k=1;
                   largest[0] = out;
                 }
             else if(out.size()==max) largest[k++]=out;

             out.assign(out,loc+1,out.size()-loc-1);
           }
         out = out + s[i];
         count[s[i] - 97]++;
      }

   for(int i=0;i<k;i++)  cout<<largest[i] << endl;
   //output will be
   // abcdefghijr
   // qrstuvwxyzp

  }

Upvotes: 0

Jiaji Li
Jiaji Li

Reputation: 1269

Pretty tricky question, I give you an O(n) solution based on C#.

public string MaxSubStringKUniqueChars(string source, int k) {

if (string.IsNullOrEmpty(source) || k > source.Length) return string.Empty;

        var start = 0;
        var ret = string.Empty;
        IDictionary<char, int> dict = new Dictionary<char, int>();

        for (var i = 0; i < source.Length; i++)
        {
            if (dict.ContainsKey(source[i]))
            {
                dict[source[i]] = 1 + dict[source[i]];
            }
            else
            {
                dict[source[i]] = 1;
            }

            if (dict.Count == k + 1)
            {
                if (i - start > ret.Length)
                {
                    ret = source.Substring(start, i - start);
                }

                while (dict.Count > k)
                {
                    int count = dict[source[start]];
                    if (count == 1)
                    {
                        dict.Remove(source[start]);
                    }
                    else
                    {
                        dict[source[start]] = dict[source[start]] - 1;
                    }

                    start++;
                }
            }

        }
        //just for edge case like "aabbcceee", should return "cceee"
        if (dict.Count == k && source.Length - start > ret.Length)
        {
            return source.Substring(start, source.Length - start);
        }

        return ret;
    }

`

//This is the test case.

    public void TestMethod1()
    {
        var ret = Item001.MaxSubStringKUniqueChars("aabcd", 2);
        Assert.AreEqual("aab", ret);

        ret = Item001.MaxSubStringKUniqueChars("aabbccddeee", 2);
        Assert.AreEqual("ddeee", ret);

        ret = Item001.MaxSubStringKUniqueChars("abccccccccaaddddeeee", 3);
        Assert.AreEqual("ccccccccaadddd", ret);

        ret = Item001.MaxSubStringKUniqueChars("ababcdcdedddde", 2);
        Assert.AreEqual("dedddde", ret);
    }

Upvotes: 2

Alan Ford
Alan Ford

Reputation: 375

Here is the solution in C#. I tested in in Visual studio 2012 and it works

public static int LongestSubstNonrepChar(string str) {

        int curSize = 0;
        int maxSize = 0;
        int end = 0;
        bool[] present = new bool[256];


        for (int start = 0; start < str.Length; start++) {
            end = start;
            while (end < str.Length) {
                if (!present[str[end]] && end < str.Length)
                {
                    curSize++;
                    present[str[end]] = true;
                    end++;
                }
                else
                    break;
            }
            if (curSize > maxSize) {
                maxSize = curSize;
            }
            //reset current size and the set all letter to false
            curSize = 0;
            for (int i = 0; i < present.Length; i++)
                present[i] = false;
        }

            return maxSize;
    }

Upvotes: 3

Hao Q
Hao Q

Reputation: 1

public static int longestNonDupSubstring(char[] str) {

    int maxCount = 0;
    int count = 0;
    int maxEnd = 0;

    for(int i=1;i < str.length;i++) {

        if(str[i] != str[i-1]) {
            count++;
        }

        if (str[i] == str[i-1]) {
            if(maxCount<count) {
                maxCount = count;
                maxEnd = i;
            }
            count = 0;
        }

        if ( i!=str.length-1 && str[i] == str[i+1]) {
            if(maxCount<count) {
                maxCount = count - 1;
                maxEnd = i-1;
            }
            count = 0;
        }
    }

    int startPos = maxEnd - maxCount + 1;
    for(int i = 0; i < maxCount; i++) {

        System.out.print(str[startPos+i]);
    }

    return maxCount;
}

Upvotes: 0

frankyym
frankyym

Reputation: 171

for every start = 0 ... (n-1), try to expend end to the right-most position.

keep a bool array used[26] to remember if any character is already used. suppose currently we finished (start, end)

for start+1,

  • first clear by set: used[str[start]] = false;
  • while ((end+1 < n) && (!used[str[end+1]])) { used[str[end+1]]=true; ++end;}

now we have check new (start, end). Total Complexity is O(N).

Upvotes: 3

Colonel Panic
Colonel Panic

Reputation: 137544

Let s be the given string, and n its length.

Define f(i) to be the longest [contiguous] substring of s ending at s[i] with distinct letters. That's unique and well-defined.

Compute f(i) for each i. It's easy to deduce from f(i-1) and s[i]:

  • If the letter s[i] is in f(i-1), let j be the greatest position j < i such that s[j] = s[i]. Then f(i) is s[j+1 .. i] (in Python notation)
  • Otherwise, f(i) is f(i-1) with s[i] appended.

The solution to your problem is any f(i) of maximal length (not necessarily unique).

You could implement this algorithm to run in O(n * 26) time, where 26 is the number of letters in the alphabet.

Upvotes: 0

Markus Jarderot
Markus Jarderot

Reputation: 89171

string MaximumSubstringNonRepeating(string text)
{
    string max = null;
    bool isCapture = false;
    foreach (string s in Regex.Split(text, @"(.)\1+"))
    {
        if (!isCapture && (max == null || s.Length > max.Length))
        {
            max = s;
        }
        isCapture = !isCapture;
    }
    return max;
}

. matches any character. ( ) captures that character. \1 matches the captured character again. + repeats that character. The whole pattern matches two or more repetitions of any one character. "AA" or ",,,,".

Regex.Split() splits the string at every match of the pattern, and returns an array of the pieces that are in between. (One caveat: It also includes the captured substrings. In this case, the one character that are being repeated. The captures will show up in between the pieces. This is way I just added the isCapture flag.)

The function cuts out all the repeated characters, and returns the longest piece that where in between the repeated each set of repeated characters.

>>> MaximumSubstringNonRepeating("AABGAKG") // "AA" is repeated
"BGAKG"

>>> MaximumSubstringNonRepeating("AABGAKGIMNZZZD") // "AA" and "ZZZ" are repeated.
"BGAKGIMN"

Upvotes: -1

Tyler Durden
Tyler Durden

Reputation: 11532

How about this:

public static String getLongestSubstringNoRepeats( String string ){
    int iLongestSoFar = 0;
    int posLongestSoFar = 0;
    char charPrevious = 0;
    int xCharacter = 0;
    int iCurrentLength = 0;
    while( xCharacter < string.length() ){
        char charCurrent = string.charAt( xCharacter );
        iCurrentLength++;
        if( charCurrent == charPrevious ){
            if( iCurrentLength > iLongestSoFar ){
                iLongestSoFar = iCurrentLength;
                posLongestSoFar = xCharacter;
            }
            iCurrentLength = 1;
        }
        charPrevious = charCurrent;
        xCharacter++;
    }
    if( iCurrentLength > iLongestSoFar ){
        return string.substring( posLongestSoFar );
    } else {
        return string.substring( posLongestSoFar, posLongestSoFar + iLongestSoFar );
    }
}

Upvotes: 1

Related Questions