Reputation: 741
In an interview I was asked this question: given some array of positive integers s, find the length of the longest subarray such that the sum of all its values is less than or equal to some positive integer k. Each input will always have at least one solution. The array is not circular.
I began writing a dynamic programming solution that worked by finding the maximum length at increasingly larger values from 0 to k.
Here is my code in python, there is an error inside it that I couldn't seem to find, my answer is always off by a few digits:
def maxLength(s, k):
lengths = [0 for x in range(k)]
for i in range(1,k+1):
for j in range(len(s)):
if s[j] <= i and lengths[i - s[j]] + 1 > lengths[i]:
lengths[i] = lengths[i - s[j]] + 1
if i + 1 == len(s):
break
return lengths[-1]
Input1:s = [1,2,3], k = 4
Output1: 2
Input2: s=[3,1,2,1], k = 4
Output2: 3
Upvotes: 22
Views: 30352
Reputation: 1895
I just stumbled over this interesting discussion.
While reading and pondering the different approaches, I wondered why no one had proposed to return early when it is impossible to find a longer subarray.
max_len > array_len - subarray_start
The overhead is one extra integer value stored in terms of memory. It also adds one subtraction and one comparison to the loop.
Both should not weigh too much and could be beneficial for longer arrays and subarrays.
Integrating this into bigblind's solution:
def max_length(s, k):
# These two mark the start and end of the subarray that `current` used to be.
array_len = len(s)
subarray_start = 0
subarray_end = 0
subarray_sum = 0
max_len = -1 # returns -1 if there is no subsequence that adds up to k.
for i in s:
subarray_sum += i
subarray_end += 1
while subarray_sum > k: # Shrink the array from the left, until the sum is <= k.
subarray_sum -= s[subarray_start]
subarray_start += 1
# After the previous while loop, subarray_sum is guaranteed to be
# smaller than or equal to k.
max_len = max(max_len, subarray_end - subarray_start)
# Return early when it is clear no longer subarray can be found
if max_len > array_len - subarray_end: return max_len
return max_len
I tested it with the original input plus several longer lists and higher k values (max_length([1,3,1,0,1,4,2,1,4,1], 4)
for example) side by side with log output so that I was able to count the loops.
What really struck me: My addition saves maybe a loop or two. If we really massage the test input there might be cases where it saves more, but all in all it is not doing much more than adding more lines of code.
I just left this here if others happen to think of this as well.
Upvotes: 0
Reputation: 1
CODE IN CPP
#include <bits/stdc++.h>
#define ll long long
using namespace std;
void solve(vector<int>& arr, int k) {
int n = arr.size();
int subarray_start = 0;
int subarray_end = 0;
int subarray_sum = 0;
int max_len = -1;
for(int i=0;i<n;i++){
subarray_sum += arr[i];
subarray_end += 1;
while(subarray_sum>=k){
subarray_sum -= arr[subarray_start];
subarray_start ++;
}
max_len = max(max_len,subarray_end - subarray_start);
}
cout<<max_len;
}
int main()
{
int n,k;
cin >> n >> k;
vector<int> v(n);
for(int i=0;i<n;i++)
cin >> v[i];
solve(v,k);
return 0;
}
[executed code snap][output snap]
Upvotes: 0
Reputation: 107
Approaches to this problem
O(n) - Two pointer approach
Initialise first element to s and e and subArray_sum = arr[0]
Now if subArray_sum < k increment e while subArray_sum <= k
Once subArray_sum become >= k increment s till it becomes <= k
O(nlog n) - Binary Search
Consider all possible subarray lengths i.(1 <= i <= n) .Among all sub arrays of length i find the one with minimum sum.For a given value of i this can be done in O(n). Now for any sub-array of length i if the subarray of length i but with minimum sum has sum <= k,we can find i length sub arrays with sum <= k.Now to find longest i such that there exists a subarray of that length with sub array sum <= k. Do binary search over range of i with start = 1 and end = n;
O(n*n) - Brute Force
Consider all possible subarrays(n*n in number) and find longest with sum <= k
Variant of above problem
Length of longest subarray with average less than or equal to k
All above approaches applicable here also
Upvotes: 1
Reputation: 12867
You can do this in linear (O(n)) time:
def max_length(s, k):
# These two mark the start and end of the subarray that `current` used to be.
subarray_start = 0
subarray_end = 0
subarray_sum = 0
max_len = -1 # returns -1 if there is no subsequence that adds up to k.
for i in s:
subarray_sum += i
subarray_end += 1
while subarray_sum > k: # Shrink the array from the left, until the sum is <= k.
subarray_sum -= s[subarray_start]
subarray_start += 1
# After the previous while loop, subarray_sum is guaranteed to be
# smaller than or equal to k.
max_len = max(max_len, subarray_end - subarray_start)
return max_len
There was some confusion with the original question, where I thought we were looking for a subarray with sum **equal to (but not less than) k*. My original answer is below. There's information about the linearity of this solution in there as well, so read on if you're interested.
Here's how I would do it:
def max_length(s, k):
current = []
max_len = -1 # returns -1 if there is no subsequence that adds up to k.
for i in s:
current.append(i)
while sum(current) > k: # Shrink the array from the left, until the sum is <= k.
current = current[1:]
if sum(current) == k:
max_len = max(max_len, len(current))
return max_len
This exploits the fact that we're looking for a contiguous subarray to get a solution with linear (O(n)) time complexity. current
is our current attempt at creating a subarray that adds up to k
. We loop through s
and add every element from s
to current
. If the total sum of current
becomes too large (larger than k
), we remove elements from the left of current
until the sum is smaller than or equal to k
. If at any point, the sum is equal to k
, we record the length.
Well... I lied, and Francisco Couzo caught me in the comments. The code above is not really O(n) I'm calling len(current)
and sum(current)
, which take at most n
steps, making the algorithm run in quadratic time (O(n^2)). We can fix this by keeping track of the size and sum of current
ourselves.
The version below gets us closer to O(n), but I noticed an issue while writing it.
def max_length(s, k):
current = []
len_current = 0
sum_current = 0
max_len = -1 # returns -1 if there is no subsequence that adds up to k.
for i in s:
current.append(i)
sum_current += i
len_current += 1
while sum_current > k: # Shrink the array from the left, until the sum is <= k.
sum_current -= current[0]
current = current[1:]
len_current -= 1
if sum_current == k:
max_len = max(max_len, len_current)
return max_len
This piece of code might look like it's O(n), and if it were written in Go, it would be. See that current = current[1:]
? According to the TimeComplexities article in the Python wiki, taking a slice from a list takes O(n).
I couldn't find a list operation that removes an element from the start, until I suddenly realized that I didn't have to. current
would always be a contiguous subarray of s
, so why not just mark its start and end?
So here's my final solution:
def max_length(s, k):
# These two mark the start and end of the subarray that `current` used to be.
subarray_start = 0
subarray_end = 0
subarray_sum = 0
max_len = -1 # returns -1 if there is no subsequence that adds up to k.
for i in s:
subarray_sum += i
subarray_end += 1
while subarray_sum > k: # Shrink the array from the left, until the sum is <= k.
subarray_sum -= s[subarray_start]
subarray_start += 1
if subarray_sum == k:
max_len = max(max_len, subarray_end - subarray_start)
return max_len
If you consider the array to be circular, which the first example case in the question seems to indicate, you can go through the array twice:
def max_length(s, k):
s = s + s
# These two mark the start and end of the subarray that `current` used to be.
subarray_start = 0
subarray_end = 0
subarray_sum = 0
max_len = -1 # returns -1 if there is no subsequence that adds up to k.
for i in s:
subarray_sum += i
subarray_end += 1
while subarray_sum > k: # Shrink the array from the left, until the sum is <= k.
subarray_sum -= s[subarray_start]
subarray_start += 1
if subarray_sum == k:
max_len = max(max_len, subarray_end - subarray_start)
return max_len
There are probably checks you can make to break out of that second pass sooner, based on the values you've encountered in the first pass.
Upvotes: 29
Reputation: 2971
This article can help you very much.
https://e2718281828459045.wordpress.com/2013/08/19/longest-subarray-whose-sum-k/
It can be solved using
Sum Array + Binary Search.
First observation you get is that if we consider ith element then we must continue with (i+1)th and so on. That is we need to add all the elements in a order till the last element or we reach the sum. So order is important.
How can we add the numbers. There are n ways present. The first way is that we can start with first element and add till we reach k or we reach the last element. The second way is that we can start with second element and read till we reach k or reaches the last element.
So a brute force algorithm will give you a O(n2) solution. How can we improve on that? Obviously it is not the expected solution. For each i
we are calculating the sum of element and check that the total exceeds the given value of 'k' or not. To avoid this we can create a sum array.
Remember, whenever you encounter a problem with sequence sum(or the sum of continuous elements in a given array) most probably it can solve using sum array technique. Sum array is a newly constructed array using the given array. It can be generated using the following formula,
sum[i] = sum[i−1] + array[i]
for all i>0. and
sum[i]=array[i]
for i=0.
Sum array can be created in O(n) time. Finding the sum between ith and jth becomes easy. It's the difference,
sum[j]−sum[i], j>i
will give you the answer. But still it O(n2) solution.
The problem is that for each value of i
we take O(n) time to find the j
.
So how can we reduce that?
Bingo! Here comes binary search. By using modified binary search on the interval i
and n
for each i
we can find the j
in O(logn) time. So it takes only O(nlogn) time. We need an additional variable and condition for storing the length of the subarray, that is j−i
.
Hope this helps.
Upvotes: 0
Reputation: 10957
Initially the question was to find the length of the longest subarray that would sum to k.
You can run through the list indices, take each index as starting point of a window over which you sum. You then run over the indices from your starting index to the end to mark the end of the window. At each step you take the sum, or even better, add it to a sum term. If the sum exceeds the target you break out of the inner loop, moving on.
It will look like this:
def get_longes(a_list, k):
longest = 0
length = len(a_list)
for i in xrange(length):
s = 0
for j in xrange(i,length):
s+=a_list[j]
if s < k:
pass
elif s==k:
longest = j+1-i
else:
break
return longest
This can be further speed up as you don't need to reset the window size when you move one step in the outer loop. In fact you just need to keep track of the window size and decrease it by 1 if the outer loop moves on. In this way you can even get rid of the inner loop and write something in O(n):
def get_longest(a_list,k):
length=len(a_list)
l_length = 0
longest = 0
s = 0
for i in xrange(length):
while s<k: # while the sum is smaller, we increase the window size
if i+l_length==length: # this could also be handled with a try, except IndexError on the s+=a_list[... line
return longest
s+=a_list[i+l_length]
l_length+=1
if s == k: # if the sum is right, keep its length if it is bigger than the last match.
longest = max(l_length, longest)
l_length-=1 # keep the window end at the same position (as we will move forward one step)
s-=a_list[i] # and subtract the element that will leave the window
return longest
The updated question asks for the longest subarray for which the sum is equal or less than k.
For this question, the basic approach is the same and actually the solution becomes even simpler as now we only have two conditions on the sum, that is:
1) the sum is smaller or equal to k.
2) the sum is bigger than k.
The solution looks like:
def get_longest_smaller_or_equal(a_list,k):
length=len(a_list)
l_length = 0
longest = 0
s = 0
for i in xrange(length):
while s<=k: # while the sum is smaller, we increase the window size
longest = max(l_length, longest)
if i+l_length==length: # this could also be handled with a try, except IndexError on the s+=a_list[... line
return longest
s+=a_list[i+l_length]
l_length+=1
l_length-=1 # keep the window end at the same position (as we will move forward one step)
s-=a_list[i] # and subtract the element that will leave the window
return longest
Upvotes: 8
Reputation: 104682
Here's a solution that works for any iterable s
(even an iterator). It's essentially the same algorithm as bigblind's answer, but it will be more efficient if k
is large relative to the values in s
(so that the lengths of relevant subsequences are long):
import itertools
def max_length(s, k):
head, tail = itertools.tee(s)
current_length = current_sum = 0
max_len = -1 # returns -1 if there is no subsequence that adds up to k.
for i in head:
current_length += 1
current_sum += i
while current_sum > k:
current_length -= 1
current_sum -= next(tail)
if current_sum == k:
max_len = max(max_len, current_sum)
return max_len
Because we don't keep a list with the subsequence we're examining as we iterate, this iterator-based approach is only useful if you only need the length of the longest subsequence, not its actual content.
If you want to get a copy of the longest subsequence, you could use a different variation on bigblind's answer, using a collections.dequeue
instead of a list (so that popping from the left is fast) and keeping track of the running sum like my code does (so you don't need to call sum
over and over):
import collections
def max_subsequence(s, k):
current = collections.dequeue()
current_sum = 0
max_len = -1
max_seq = None # returns None if there is no valid subsequence.
for i in s:
current.append(i)
current_sum += i
while current_sum > k: # Shrink from the left efficiently!
current_sum -= current.popleft()
if current_sum == k:
if len(current) > max_len:
max_len = len_current
max_seq = list(current) # save a copy of the subsequence
return max_seq
If your question title is misleading and you don't actually care if the subsequence is contiguous, then I think your current dynamic programming approach can do what you want. I'm just not entirely sure I understand how your loops were intended to work. I think it is most natural with an outer loop over the input items, and a second loop over potential sums that include that value (which are indexes into the lengths
list). I'd also suggest using None
as the initial value for all lengths other than 0
, since otherwise its hard to make the conditions work correctly without special cases.
def max_length(s, k):
lengths = [None for _ in range(k+1)]
lengths[0] = 0
for x in s:
for i in range(k, x-1, -1): # count down to avoid duplication
if lengths[i-x] is not None and (lengths[i] is None or
lengths[i-x] >= lengths[i]):
lengths[i] = lengths[i-x] + 1
return lengths[k]
Upvotes: 2
Reputation: 18418
I think this works... (recursively and taking out the contiguous requirement from the question, since that doesn't seem to match the sample outputs provided in the question) and the OP mentions that the question was:
given some array of positive integers s, find the length of the longest subarray such that the sum of all values is equal to some positive integer k.
def longest_sum(input_list, index, num_used, target_number):
if target_number == 0:
return num_used
if index >= len(input_list):
return 0
# Taken
used_1 = longest_sum(input_list, index + 1, num_used + 1, target_number - input_list[index])
# Not taken
used_2 = longest_sum(input_list, index + 1, num_used, target_number)
return max(used_1, used_2)
if __name__ == "__main__":
print(longest_sum([2, 1, 8, 3, 4], 0, 0, 6))
print(longest_sum([1, 2, 3], 0, 0, 4))
print(longest_sum([3, 1, 2, 1], 0, 0, 4))
print(longest_sum([1, 2, 7, 8, 11, 12, 14, 15], 0, 0, 10))
print(longest_sum([1, 2, 3], 0, 0, 999))
print(longest_sum([1, 1, 1, 1, 1, 1, 4], 0, 0, 6))
Outputs:
3
# BorrajaX's note: 2 + 1 + 3
2
# BorrajaX's note: 3 + 1
3
# BorrajaX's note: 1 + 2 + 1
3
# BorrajaX's note: 1 + 2 + 7
0
# BorrajaX's note: No possible sum
6
# BorrajaX's note: 1 + 1 + 1 + 1 + 1 + 1
EDIT 01:
If you wanted to fetch which is the list that gives you the longest sum, you could always do it like this:
import copy
def longest_sum(input_list, used_list, target_number):
if target_number == 0:
return used_list
if not input_list:
return []
# Taken
used_list_taken = copy.copy(used_list)
used_list_taken.append(input_list[0])
used_1 = longest_sum(input_list[1:], used_list_taken, target_number - input_list[0])
# Not taken
used_list_not_taken = copy.copy(used_list)
used_2 = longest_sum(input_list[1:], used_list_not_taken, target_number)
if len(used_1) > len(used_2):
return used_1
else:
return used_2
if __name__ == "__main__":
print(longest_sum([2, 1, 8, 3, 4], [], 6))
print(longest_sum([1, 2, 3], [], 4))
print(longest_sum([3, 1, 2, 1], [], 4))
print(longest_sum([1, 2, 7, 8, 11, 12, 14, 15], [], 10))
print(longest_sum([1, 2, 3], [], 999))
print(longest_sum([1, 1, 1, 1, 1, 1, 4], [], 6))
You'd see:
[2, 1, 3]
[1, 3]
[1, 2, 1]
[1, 2, 7]
[]
[1, 1, 1, 1, 1, 1]
PS 1: I really don't know how to do this without the quick backtracking capabilities that recursion provides... Sorry :-(
PS 2: If this is not what you wanted (I mentioned that I took out the contiguous requirement from the requirements) let me know, and I'll remove this answer.
Upvotes: 3
Reputation: 1376
Bit shorter and assuming non circular s: Slide windows of decreasing size over s.
def maxlen(s, k):
for win in range(k, 0, -1):
for i in range(0, len(s) - win):
if sum(s[i:i+win]) == k:
return win
return None
s = [3,1,2,3]
k = 4
print(maxlen(s, k))
Upvotes: 2