MRT89
MRT89

Reputation: 319

Using a hash to find one duplicated and one missing number in an array

I had this question during an interview and am curious to see how it would be implemented.

Given an unsorted array of integers from 0 to x. One number is missing and one is duplicated. Find those numbers.

Here is what I came up with:

int counts[x+1];

for(int i =0;i<=x; i++){
    counts[a[i]]++;
    if(counts[a[i]] == 2)
        cout<<”Duplicate element: “<<a[i];  //I realized I could find this here
}

for(int j=0; j<=x; j++){
    if(counts[j] == 0)
        cout<<”Missing element: “<<j; 
    //if(counts[j] == 2)
    //  cout<<”Duplicate element: “<<j;   //No longer needed here.
}

My initial solution was to create another array of size x+1, loop through the given array and index into my array at the values of the given array and increment. If after the increment any value in my array is two, that is the duplicate. However, I then had to loop through my array again to find any value that was 0 for the missing number.

I pointed out that this might not be the most time efficient solution, but wasn't sure how to speed it up when I was asked. I realized I could move finding the duplicate into the first loop, but that didn't help with the missing number. After waffling for a bit, the interviewer finally gave me the idea that a hash would be a better/faster solution. I have not worked with hashes much, so I wasn't sure how to implement that. Can someone enlighten me? Also, feel free to point out any other glaring errors in my code... Thanks in advance!

Upvotes: 2

Views: 2034

Answers (5)

timmz
timmz

Reputation: 2252

for (i=0 to length) { // first loop
 for( j=0 to length ){ // second loop
      if (t[i]==j+1) {
           if (counter==0){//make sure duplicated number has not been found already
               for(  k=i+1 to length ) { //search for duplicated number
                   if(t[k]==j+1){
                        j+1 is the duplicated number ;
                        if(missingIsFound)
                               exit // exit program, missing and dup are found 
                        counter=1 ;
                        }//end if t[k]..
                 }//end loop for duplicated number
                 } // end condition to search
          continue ; // continue to first loop   
      }
 else{
       j+1 is the missing number ;
       if(duplicatedIsFound)
                  exit // exit program, missing and dup are found 
       continue ; //continue to first loop

 }//end second loop
} //end first loop

Upvotes: 0

Jens
Jens

Reputation: 9416

Sorting allowed?

auto first = std::begin(a);
auto last = std::end(a);

// sort it
std::sort( first, last );

// find duplicates
auto first_duplicate = *std::adjacent_find( first, last );

// find missing value
auto missing = std::adjacent_find(first, last, [](int x, int y) {return x+2 == y;});
int missing_number = 0;
if (missing != last)
{
    missing_number = 1+ *missing;
}
else
{
    if (counts[0] != 0)
    {
        missing_number = 0;
    }
    else
    {
        missing_number = 9;
    }
}

Both could be done in a single hand-written loop, but I wanted to use only stl algorithms. Any better idea for handling the corner cases?

Upvotes: 0

rcgldr
rcgldr

Reputation: 28921

If the range of values is the about the same or smaller than the number of values in an array, then using a hash table will not help. In this case, there are x+1 possible values in an array of size x+1 (one missing, one duplicate), so a hash table isn't needed, just a histogram which you've already coded.

If the assignment were changed to be looking for duplicate 32 bit values in an array of size 1 million, then the second array (a histogram) could need to be 2^32 = 4 billion counts long. This is when a hash table would help, since the hash table size is a function of the array size, not the range of values. A hash table of size 1.5 to 2 million would be large enough. In this case, you would have 2^32 - 2^20 = 4293918720 "missing" values, so that part of the assignment would go away.

Wiki article on hash tables:

Hash Table

Upvotes: 3

Astockwell
Astockwell

Reputation: 1536

Here is a stab at a solution that uses an index (a true key-value hash doesn't make sense when the array is guaranteed to include only integers). Sorry OP, it's in Ruby:

values = mystery_array.sort.map.with_index { |n,i| n if n != i }.compact

missing_value,duplicate_value = mystery_array.include?(values[0] - 1) ? \
    [values[-1] + 1, values[0]] : [values[0] - 1, values[-1]]

The functions used likely employ a non-trivial amount of looping behind the scenes, and this will create a (possibly very large) variable values which contains a range between the missing and/or duplicate value, as well as a second lookup loop, but it works.

Perhaps the interviewer meant to say Set instead of hash?

Upvotes: 0

Scott Hunter
Scott Hunter

Reputation: 49883

If x were small enough (such that the sum of 0..x can be represented), you could compute the sum of the unique values in a, and subtract that from the sum of 0..x, to get the missing value, without needing the second loop.

Upvotes: 2

Related Questions