Reputation: 5705
Recently in an interview I was asked to write a program to find the largest sub string which contains equal number of 0
s and 1
s in a binary string.
For example:
If the given string is 1010111
then the output will be 1010
as it contains 2 0
s and 2 1
s.
I tried a lot but can't come up with anyway to build an algorithm for this thing.
Can someone give me head start on how to proceed or what data structure to use for this problem ?
Upvotes: 4
Views: 4992
Reputation: 82899
The following will work in O(n) time and space, n being the number of characters in the string.
balance
(or imbalance) of 1
and 0
you've seen in the string
, and the first
time the string had the same balance (an array or map with at most n
entries)string
, and for each character...
balance
, counting "1"
as 1
and "0"
as -1
or vice versabalance
beforebest
, remember the new longest substringfirst
positionExample code in Python:
string = "1010111000"
first = {0: 0} # map or array; 0th element is 0
balance = 0 # initially, 1 and 0 are balanced
best = "" # best substring found
for i, c in enumerate(string): # (i, c) = (index, character)
balance += 1 if c == "1" else -1 # update balance
if balance not in first: # first time we see this balance?
first[balance] = i+1 # add next(!) position to map/array
elif i - first[balance] > len(best): # otherwise, if new longest substring
best = string[first[balance]:i+1] # update best with slice of string
print(i, c, balance, best, first) # debugging/demo output
Output:
0 1 1 {0: 0, 1: 1}
1 0 0 10 {0: 0, 1: 1}
2 1 1 10 {0: 0, 1: 1}
3 0 0 1010 {0: 0, 1: 1}
4 1 1 1010 {0: 0, 1: 1}
5 1 2 1010 {0: 0, 1: 1, 2: 6}
6 1 3 1010 {0: 0, 1: 1, 2: 6, 3: 7}
7 0 2 1010 {0: 0, 1: 1, 2: 6, 3: 7}
8 0 1 01011100 {0: 0, 1: 1, 2: 6, 3: 7}
9 0 0 1010111000 {0: 0, 1: 1, 2: 6, 3: 7}
Upvotes: 10
Reputation: 15247
This solution isn't the best one in terms of optimisation, but in the hurry and the stress of an interview, this one can be quickly thought, drawn and explained.
I would nest 2 loops.
1 starting from 0 to len - 2 (incrementing) (the minimal length should be 2)
1 starting from len to previous loop value + 2 (decrementing) (the minimal length should be 2)
Get the substring of the corresponding iterators of the loops
Count if the characters are equal.
then, if true, compare against stored result length, if the length is greater, overwrite the result.
Using 0100
as example, that will test against those values :
// result = ''
0100 //not balanced
010 //not balanced
01 //balanced AND length is greated than result's one. result = '01'
100 //not balanced
10 //balanced BUT length is not greated than result's one
00 //not balanced
JavaScript example (I tweaked it a bit to optimise the number of iterations, but the approach is the same) :
var iterations = 0;
function IsBalanced(input, char1, char2)
{
if (input.length % 2 != 0) // odd length can't be balanced
{
++iterations;
return (false);
}
let char1count = 0;
let char2count = 0;
let len = input.length;
for (let i = 0; i < len; ++i)
{
++iterations;
if (input[i] == char1)
++char1count;
else
++char2count;
}
return (char1count == char2count);
}
function findLargest(input, char1, char2)
{
let len = input.length;
let result = '';
for (let k = 0; k < len - 1; ++k)
{
// This is a tweak to reduce the number of iterations
// To avoid testing a substring smaller than the current result
// |
// |
// v----------------------v
for (let l = len; l - k > result.length && l > k + 1; --l)
{
tempResult = input.substring(k, l);
if (IsBalanced(tempResult, char1, char2) && tempResult.length > result.length)
result = tempResult;
}
}
return (result);
}
console.log("Input : 1010111 - result : " + findLargest("1010111", "1", "0") + " original size : " + "1010111".length + " - iterations : " + iterations);
iterations = 0;
console.log("Input : ababaaa - result : " + findLargest("ababaaa", "a", "b") + " original size : " + "ababaaa".length + " - iterations : " + iterations);
iterations = 0;
console.log("Input : 00100100 - result : " + findLargest("00100100", "1", "0") + " original size : " + "00100100".length + " - iterations : " + iterations);
iterations = 0;
console.log("Input : 1111100000 - result : " + findLargest("1111100000", "1", "0") + " original size : " + "1111100000".length + " - iterations : " + iterations);
iterations = 0;
console.log("Input : 0001111111111010001111100000000001111111111 - result : " + findLargest("0001111111111010001111100000000001111111111", "1", "0") + " original size : " + "0001111111111010001111100000000001111111111".length + " - iterations : " + iterations);
iterations = 0;
console.log("Input : 0000000000000000000000000000000000000000000000000001 - result : " + findLargest("0000000000000000000000000000000000000000000000000001", "1", "0") + " original size : " + "0000000000000000000000000000000000000000000000000001".length + " - iterations : " + iterations);
Upvotes: 1
Reputation: 1894
I would approach it like this
Initialise a variable integer sum and maxlength = minimum
Define a hashmap with with sum as key and index as value
For each value in the given string
sum += arr[i] == 0 ? then add -1 otherwise add a 1
if the sum is 0 maxlength = maxlength or index + 1 since this is a potential answer
else if dictionary contains that sum value, maxlength = maxlength or (i -index hash[sum]) the sum value found earlier contributing to the result.
update the hashmap if the value of sum is not in the hash map with index
return the maximumlength.
Here is an example for what I mentioned above, in code : working example you can try changing the test case to see how this would work for various test cases, try printing the hash map as well and trace it by hand to gain a deeper understanding.
Upvotes: 3