This problem is taken directly from Cracking the Coding Interview, 4th Ed , so I'm not 100% sure I can post it here; if not, just let me know and I'll delete this. Question 5.7 : An array A[1..n] contains all the integers from 0 to n except for one number which is missing. In this problem, we cannot access an entire integer in A with a single opera-tion. The elements of A are represented in binary, and the only operation we can use to access them is “fetch the jth bit of A[i]”, which takes constant time. Write code to find the missing integer. Can you do it in O(n) time? I know how to solve this. What I don't understand, is how she solved it. Her method: Start with least sig bit. Count occurrence of 1's vs 0's. If count(1) < count(0) => the missing number has a 1 as it's least sig bit, else it has a 0. Remove all numbers with least sig bit not matching result found in step 3. Repeat steps 1->4, iterating from least sig bit -> 2nd least sig bit -> ... -> most sig bit I can see this working when n is of the form (2^m)-1 for some positive m ... but don't see how it would work in the general case. Consider the natural numbers in binary radix. Then the sequence of the ith-least sig bit goes like: 0,1,0,1,0,1,0,... = {01}* = {(1)0(1)1}* 0,0,1,1,0,0,1,1,... = {0011}* = {(2)0(2)1}* 0,0,0,1,1,1,0,0,0,... = {000111}* = {(3)0(3)1}* Then the most sig bit has some sequence {(s)0(s)1} for some s . If n=(2^m)-1 , then all is well; for each magnitude, the #1s = #0's and thus we can use the authors logic. But , how does this work in the general case? If n is some arbitrary number, the sequence of most-sig bits leading up to n looks like: (s)0(s)1(s)0(s)1...(k)1 (clearly the sequence must end with 1 as most sig bit), and k could be any number in [0,s]. So how does her logic apply? (Most notably, step 3 assumes that #0's is at most 1 more than #1's in the usual case) Thanks if anyone could shed some light on this! Thanks!

There are 3 possibilities: n is odd, so the number of 0 bits and 1 bits should be the same. They won't be, so the lower number is obviously the missing one. n is even, so the number of 0 bits should be 1 greater than the number of 1 bits. If they're equal, it's the 0 bit that's missing. As above, n is even. If the number of 0 bits is 2 greater than the number of 1 bits, the 1 bit is the missing one. By removing all the numbers that have been eliminated, you're now applying the exact same problem again, recursively.

algorithm

zac

Reputation: 857

Find missing number in bit array

This problem is taken directly from Cracking the Coding Interview, 4th Ed, so I'm not 100% sure I can post it here; if not, just let me know and I'll delete this.

Question 5.7:

An array A[1..n] contains all the integers from 0 to n except for one number which is missing. In this problem, we cannot access an entire integer in A with a single opera-tion. The elements of A are represented in binary, and the only operation we can use to access them is “fetch the jth bit of A[i]”, which takes constant time. Write code to find the missing integer. Can you do it in O(n) time?

I know how to solve this. What I don't understand, is how she solved it. Her method:

Start with least sig bit.
Count occurrence of 1's vs 0's.
If count(1) < count(0) => the missing number has a 1 as it's least sig bit, else it has a 0.
Remove all numbers with least sig bit not matching result found in step 3.
Repeat steps 1->4, iterating from least sig bit -> 2nd least sig bit -> ... -> most sig bit

I can see this working when n is of the form (2^m)-1 for some positive m... but don't see how it would work in the general case.

Consider the natural numbers in binary radix. Then the sequence of the ith-least sig bit goes like:

0,1,0,1,0,1,0,... = {01}* = {(1)0(1)1}*
0,0,1,1,0,0,1,1,... = {0011}* = {(2)0(2)1}*
0,0,0,1,1,1,0,0,0,... = {000111}* = {(3)0(3)1}*

Then the most sig bit has some sequence {(s)0(s)1} for some s. If n=(2^m)-1, then all is well; for each magnitude, the #1s = #0's and thus we can use the authors logic. But, how does this work in the general case? If n is some arbitrary number, the sequence of most-sig bits leading up to n looks like: (s)0(s)1(s)0(s)1...(k)1 (clearly the sequence must end with 1 as most sig bit), and k could be any number in [0,s]. So how does her logic apply? (Most notably, step 3 assumes that #0's is at most 1 more than #1's in the usual case)

Thanks if anyone could shed some light on this! Thanks!

Upvotes: 17

Answers (3)

Jeff Li

Reputation: 31

I ran into this problem as an assignment. Although I know the instructor might be looking for something more advanced, I think the simplest approach still works here: even if you can only access the integer bit by bit, you can still calculate their sum in O(n) time. To be precise, just read the 32 * n bits and do the math. Then you compare with n*(n+1)/2 and then you know the answer.

One thing I want to mention here is,

this is actually not an O(n) algorithm, or otherwise called linear. It is actually a pseudo-linear because you need n * W operations, where W is the average length of all the integers (do not always assume they have 32 bits unless the problem explicitly says so). W is independent of n, so it cannot be fully justified to be a linear algorithm.

Upvotes: 3

Santosh Sarangi

Reputation: 21

Changing the earlier mentioned xor solution. bit0 = XOR all bit0 of numbers present is array and XOR of all bit0 of 0 to N simillarly find bit1 to bit31. Then result is bit0 | (bit1 << 1) ... (bit31 << 31) I have written a test code for numbers which can be written using max 3 bits. Looks to me working. This code can be extended for 32bit int. Let me know if there is any mistake in this approach. Thanks Poul for finding mistake in earlier soln.

#include <stdio.h>

int main()
{

    unsigned int a[7] = {0,6,2,3,4,5,1};
    unsigned int bit0, bit1, bit2, missing,i;
    bit0 = getbit(a[0], 0);
    bit1 = getbit(a[0], 1);
    bit2 = getbit(a[0], 2);
    for (i=1 ; i <sizeof(a)/sizeof(unsigned int); i++)
     {
          bit0 ^= getbit(a[i],0) ;
          bit1 ^= getbit(a[i],1);
          bit2 ^= getbit(a[i],2);
     }
     for(i = 0; i <= sizeof(a)/sizeof(unsigned int); i++)
     {
          bit0 ^= getbit(i,0);
          bit1 ^= getbit(i,1);
          bit2 ^= getbit(i,2); 
     }

     missing  = bit0 | (bit1<<1) | bit2 << 2;
     printf("%u\n",missing);
    return 0;
}
int getbit(unsigned int x, unsigned int pos)
{
   return ( (x & (1 << pos)) >> pos) ;
}

Upvotes: -1

Mark Ransom

Reputation: 308462

There are 3 possibilities:

n is odd, so the number of 0 bits and 1 bits should be the same. They won't be, so the lower number is obviously the missing one.
n is even, so the number of 0 bits should be 1 greater than the number of 1 bits. If they're equal, it's the 0 bit that's missing.
As above, n is even. If the number of 0 bits is 2 greater than the number of 1 bits, the 1 bit is the missing one.

By removing all the numbers that have been eliminated, you're now applying the exact same problem again, recursively.

Upvotes: 10

Find missing number in bit array

Answers (3)

Related Questions