Reputation: 171
Hello all I have a problem with understanding the bitsort program in Bentley's classic programming pearls. I am new to Bitmask and Bitset, so I am not able to understanding these concepts. Actually the program is for "How to sort a disk file?".So below is the code
#include <stdio.h>
#define BITSPERWORD 32
#define SHIFT 5
#define MASK 0x1F
#define N 10000000
int a[1 + N/BITSPERWORD];
void set(int i) {
a[i>>SHIFT] |= (1<<(i & MASK));
}
void clr(int i) {
a[i>>SHIFT] &= ~(1<<(i & MASK));
}
int test(int i){
return a[i>>SHIFT] & (1<<(i & MASK));
}
int main()
{ int i;
for (i = 0; i < N; i++)
clr(i);
/* Replace above 2 lines with below 3 for word-parallel init
int top = 1 + N/BITSPERWORD;
for (i = 0; i < top; i++)
a[i] = 0;
*/
while (scanf("%d", &i) != EOF)
set(i);
for (i = 0; i < N; i++)
if (test(i))
printf("%d\n", i);
return 0;
}
Could some one please explain this code to me? If possible please provide a version in Java. Actually, I am comfortable in Java only so that why I am asking. This is not homework.
Possible duplicates: link1 link2
Upvotes: 0
Views: 1637
Reputation: 33545
We're given the numbers are going to lie between 0 to N
,
So we make a large BitSet
, which is essential a large boolean array (working explained at the end) but takes less memory (each bit is technically a boolean)
So what Jon does, he sets the entire bit set to false
,then for each number encountered, he sets that Bit to true....finally, he runs through the bitset and for each true
entry he prints the index. This will sort an array where we know an element always lies between 0 to N
.
Note : The above algorithm will fail with duplicates.
Now for the Bit Mask stuff...
Say i have a integer array (sizeof(int) = 32) But i want to use it like a boolean array of N size. So how many elements do i really need? It's N/32
int a[1 + N/BITSPERWORD]; // allocates BitSet of N size
Now if I want to access ith
element of the BitSet here's how the indexing works.
Eg, i = 49
so a[0]
contains bits 0-31, a[1] contains 32-63.
a[i/32]
(gives you which int array element contains the bit)
and i % 32
the bit position within that element.
so for i= 49
, a[49/32] & ( 1 << (i % 32) )
will tell you if the 49th bit is set or not.
If you are familiar with bitwise optimizations, you know that dividing by factors of 2, is essentially, shifting the number right by the number of factors.
32 = 2^5...therefore X/32
same as X >> 5
also X % 32
is same as X & 0x1f
test
function gives out 1 if the bitset at the position is set,
clr
clears the bitset at that position to zero
set
sets the bitset at the position to one.
Phew! Hope that helps.
Upvotes: 3
Reputation: 1303
If I remember correctly, the use of a bit set in this context is to efficiently track which integers appear in the disk file. When passing over the file, the presence of the integer, i, is indicated by setting the ith bit of the array, a, to 1.
For example, if a[0] = 5, the binary value of the 32 bits (remember the BITSPERWORD constant assumes the code is running on a 32 bit machine) at a[0] is 0...01010. This would indicate that the integers 1 and 3 were present in the file, but the integers 2 as well as 4 - 31 were not found in the file.
The steps of the main method are:
Upvotes: 2