Yash
Yash

Reputation: 41

Find duplicate in an array without changing inputs?

Given a constant array with integers in the range of [1...N] and the length of array is N+1. The array contains atleast one duplicate and we need to find that duplicate in constant space and linear time,without destroying the input. I was stuck on it and then i finally saw the solution of using Floyd-cycle finding algorithm,how to prove that it works?

Upvotes: 3

Views: 1411

Answers (2)

Matt Timmermans
Matt Timmermans

Reputation: 59184

Consider the array slots, numbered 1 to N+1.

Each slot has a number in it, which we can consider to be the pointer its 'successor' slot.

Start at slot N+1 and follow the pointers. Every slot has a pointer, so we can follow them forever, but there are only N slots, so at some point we will see a pointer that we've seen before, and that gets us into a cycle.

The first pointer we see twice is a duplicated number. The proof is as follows:

Given any slot x, we can say that x is "outside the loop" if x is visited only once, and "inside the loop" if we visit it multiple times.

If a slot x is outside the loop, then:

  • If the pointer in slot x is never seen again, then the successor of slot x is also outside of the loop;
  • If the pointer in slot x is seen again, then the successor of x will be visited again, and it is inside the loop.

Obviously, slot N+1 is outside the loop, because the pointers only range from 1 to N.

Since we start on a slot that is outside the loop, and we must eventually visit slots inside the loop, there is a slot inside the loop that has a predecessor outside the loop. Since it will be visited again, however, and nodes outside the loop are only seen once, it must also have a predecessor that is inside the loop, because that is the only way we can return to it.

That slot's number therefore appears in at least two other slots -- its predecessors -- and is a duplicate element in the array.

Upvotes: 3

kaya3
kaya3

Reputation: 51053

If we interpret "without changing inputs" to mean that the array must be left in the same state it started in, then we can cheat a little bit: the numbers are in the range 1..N, so we can store extra information using numbers outside this range, so long as we put the original numbers back afterwards. A simple scheme is to flip x to -x to store an extra N+1 bits of information without allocating O(N) additional memory. The original state can be restored after the answer is found, in linear time, by flipping all the negative values back.

The solution works by flipping the value at index i when the value i is in the array. If it's already flipped, then i is a duplicate. Here's an example in Python:

def find_duplicate(arr):
    result = -1
    for i in arr:
        if arr[i] < 0:
            result = i
            break
        arr[i] *= -1

    # restore original state
    for i in range(len(arr)):
        if arr[i] < 0:
            arr[i] *= -1

    return result

Upvotes: 1

Related Questions