Vivek Mishra
Vivek Mishra

Reputation: 5705

Find the largest substring which contains equal number of 0s and 1s in a binary string

Recently in an interview I was asked to write a program to find the largest sub string which contains equal number of 0s and 1s in a binary string.

For example:

If the given string is 1010111 then the output will be 1010 as it contains 2 0s and 2 1s.

I tried a lot but can't come up with anyway to build an algorithm for this thing.

Can someone give me head start on how to proceed or what data structure to use for this problem ?

Upvotes: 4

Views: 4992

Answers (3)

tobias_k
tobias_k

Reputation: 82899

The following will work in O(n) time and space, n being the number of characters in the string.

  • keep track of the current balance (or imbalance) of 1 and 0 you've seen in the string, and the first time the string had the same balance (an array or map with at most n entries)
  • iterate the string, and for each character...
    • update the balance, counting "1" as 1 and "0" as -1 or vice versa
    • check when, if at all, you encountered the same balance before
    • if the difference is greater than the current best, remember the new longest substring
    • if you haven't encountered that balance yet, remember it's current first position

Example code in Python:

string = "1010111000"
first = {0: 0}  # map or array; 0th element is 0
balance = 0     # initially, 1 and 0 are balanced
best = ""       # best substring found
for i, c in enumerate(string):             # (i, c) = (index, character)
    balance += 1 if c == "1" else -1       # update balance
    if balance not in first:               # first time we see this balance?
        first[balance] = i+1               # add next(!) position to map/array
    elif i - first[balance] > len(best):   # otherwise, if new longest substring
        best = string[first[balance]:i+1]  # update best with slice of string
    print(i, c, balance, best, first)      # debugging/demo output

Output:

0 1 1  {0: 0, 1: 1}
1 0 0 10 {0: 0, 1: 1}
2 1 1 10 {0: 0, 1: 1}
3 0 0 1010 {0: 0, 1: 1}
4 1 1 1010 {0: 0, 1: 1}
5 1 2 1010 {0: 0, 1: 1, 2: 6}
6 1 3 1010 {0: 0, 1: 1, 2: 6, 3: 7}
7 0 2 1010 {0: 0, 1: 1, 2: 6, 3: 7}
8 0 1 01011100 {0: 0, 1: 1, 2: 6, 3: 7}
9 0 0 1010111000 {0: 0, 1: 1, 2: 6, 3: 7}

Upvotes: 10

Cid
Cid

Reputation: 15247

This solution isn't the best one in terms of optimisation, but in the hurry and the stress of an interview, this one can be quickly thought, drawn and explained.

I would nest 2 loops.

1 starting from 0 to len - 2 (incrementing) (the minimal length should be 2)

1 starting from len to previous loop value + 2 (decrementing) (the minimal length should be 2)

Get the substring of the corresponding iterators of the loops

Count if the characters are equal.

then, if true, compare against stored result length, if the length is greater, overwrite the result.

Using 0100 as example, that will test against those values :

// result = ''
0100 //not balanced
010  //not balanced
01   //balanced AND length is greated than result's one. result = '01'
 100 //not balanced
 10  //balanced BUT length is not greated than result's one
  00 //not balanced

JavaScript example (I tweaked it a bit to optimise the number of iterations, but the approach is the same) :

var iterations = 0;

function IsBalanced(input, char1, char2)
{
    if (input.length % 2 != 0) // odd length can't be balanced
    {
      ++iterations;
      return (false);
    }
    let char1count = 0;
    let char2count = 0;
    let len = input.length;
    for (let i = 0; i < len; ++i)
    {
        ++iterations;
        if (input[i] == char1)
            ++char1count;
        else
            ++char2count;
    }
    return (char1count == char2count);
}

function findLargest(input, char1, char2)
{
    let len = input.length;
    let result = '';
    for (let k = 0; k < len - 1; ++k)
    {
        //        This is a tweak to reduce the number of iterations
        //  To avoid testing a substring smaller than the current result
        //                           |
        //                           |
        //                v----------------------v
        for (let l = len; l - k > result.length && l > k + 1; --l)
        {
            tempResult = input.substring(k, l);
            if (IsBalanced(tempResult, char1, char2) && tempResult.length > result.length)
                result = tempResult;
        }
    }
    return (result);
}

console.log("Input : 1010111 - result : " + findLargest("1010111", "1", "0") + " original size : " + "1010111".length + " - iterations : " + iterations);
iterations = 0;
console.log("Input : ababaaa - result : " + findLargest("ababaaa", "a", "b") + " original size : " + "ababaaa".length + " - iterations : " + iterations);
iterations = 0;
console.log("Input : 00100100 - result : " + findLargest("00100100", "1", "0") + " original size : " + "00100100".length + " - iterations : " + iterations);
iterations = 0;
console.log("Input : 1111100000 - result : " + findLargest("1111100000", "1", "0") + " original size : " + "1111100000".length + " - iterations : " + iterations);
iterations = 0;
console.log("Input : 0001111111111010001111100000000001111111111 - result : " + findLargest("0001111111111010001111100000000001111111111", "1", "0") + " original size : " + "0001111111111010001111100000000001111111111".length + " - iterations : " + iterations);
iterations = 0;
console.log("Input : 0000000000000000000000000000000000000000000000000001 - result : " + findLargest("0000000000000000000000000000000000000000000000000001", "1", "0") + " original size : " + "0000000000000000000000000000000000000000000000000001".length + " - iterations : " + iterations);

Upvotes: 1

Samer Tufail
Samer Tufail

Reputation: 1894

I would approach it like this

  • Initialise a variable integer sum and maxlength = minimum

  • Define a hashmap with with sum as key and index as value

  • For each value in the given string

    • sum += arr[i] == 0 ? then add -1 otherwise add a 1

    • if the sum is 0 maxlength = maxlength or index + 1 since this is a potential answer

    • else if dictionary contains that sum value, maxlength = maxlength or (i -index hash[sum]) the sum value found earlier contributing to the result.

    • update the hashmap if the value of sum is not in the hash map with index

    • return the maximumlength.

Here is an example for what I mentioned above, in code : working example you can try changing the test case to see how this would work for various test cases, try printing the hash map as well and trace it by hand to gain a deeper understanding.

Upvotes: 3

Related Questions