Lin Ma
Lin Ma

Reputation: 10159

2 sum algorithm correctness of implementation

The traditional two sum problem (my below reference implementation) is, for a sorted array, find all pairs of numbers which sum value is equal to a given target value. Traditional implementation scans from two ends of array, increase low index pointer if current sum is smaller than target value, decrease high index pointer if current sum is larger than target value.

My question is, is there a proof to show the correctness of this algorithm, showing that it does not miss a pair?

BTW, if my code has issues, please feel free to point out.

def twoSum(numbers, skipIndex, targetValue):
    i = 0
    j = len(numbers) - 1
    result = []
    while i < j:
        if i in skipIndex:
            i+=1
        if j in skipIndex:
            j+=1
        if numbers[i] + numbers[j] == targetValue:
            result.append((numbers[i], numbers[j]))
            i += 1
            j -= 1
        elif numbers[i] + numbers[j] > targetValue:
            j -= 1
        else:
            i += 1

    return result

if __name__ == "__main__":
    numbers = [1,2,4,5,6,7,9,10]
    print twoSum(numbers, [1], 11) # output, [(1, 10), (4, 7), (5, 6)]
    print twoSum(numbers, [], 11) # output, [(1, 10), (2, 9), (4, 7), (5, 6)]

Upvotes: 4

Views: 1509

Answers (1)

Linus Arver
Linus Arver

Reputation: 1378

Let's just simplify the problem to find the first such pair, not all pairs (extending it to all pairs that add up to targetValue is straightforward).

Whenever you increase the left pointer i, you are throwing out the corresponding element from further consideration because when added with any other element in the array, the sum is still too small. So it is unusable. A similar argument follows for decrementing the right pointer j, with the tweak that the sum is too big rather than too small.

Another way to look at it is if you consider that modifying the pointers is the same thing as modifing the (sorted) list, because effectively it is the same thing as deleting the smallest and largest elements.

Lastly if the pointers ever cross, this is the same thing as the list becoming too small (<2 elements) to continue searching, and you are done.

What I described above is for the case of searching for the first pair that meets targetValue. To find all such pairs, you would simply continue to search as long as the pointers don't cross (instead of returning at the moment targetValue is found from the current pairwise sum). You've already done this with your while i < j condition (although for me personally i != j is more legible).

I am not sure what the utility of skipIndex is in your implementation (debug code?) but j += 1 feels suspiciously buggy to me.

For further reading, I wrote a blog post: https://funloop.org/post/2020-12-05-twosum-problem-explained.html. I am not good at math so it is a non-mathy explanation.

Upvotes: 1

Related Questions