johnSmith563

Reputation: 53

Representing prime numbers as a continuous bit array in C

I am trying to make a very large bit array where the position of the bit in memory is the number, and the value of the bit (0 or 1) is the state of that number (prime/not prime). I am trying it like this:

#define SIZE 2

int* A = calloc(SIZE, 4);
//retrieve the state of a number (True if not prime, false if prime)
if(A[0] & (1 << 0)){} //will give me the state for 0
if(A[0] & (1 << 1)){} //will give me the state for 1
if(A[0] & (1 << 2)){} //will give me the state for 2
if(A[0] & (1 << 7)){} //will give me the state for 7
if(A[0] & (1 << 8)){} //will give me the state for 8
if(A[0] & (1 << 31)){} //will give me the state for 31

//Now can I do this??
if(A[0] & (1 << 32)){} //will give me the state for 32
//Instead of doing this??

if(A[1] & (1 << 0)){} //will also give me the state for 32


//Ultimately
if(A[0] & (1 << n)){} //will give me the n-th bit of my continuous block of memory??

I want to avoid indexing, as that would require division and modulo. My question is: can I access the bits of the next array element using only bit shift operations, and ultimately change them?

A[0] = A[0] | (1 << n);

here is my full code:

/*
 *
 * @author Matthew Pecko
 * 12/29/2020
 *
 * Sieve of Eratosthenes in C
 *
 * Define MIN as 0, and MAX to be the upper bound
 * of the prime range (in decimal)
 *
 * Define diff as (max - min) + 1
 *
 * The number of 32-bit words we will need can be found:
 * numberOfWords = diff/32
 * trueNumber = ceil(numberOfWords)
 *
 * Calculate the true number of words and define it as SIZE
 *
 * A number's state can be retrieved:
 * state[0] & (1 << 0) -- will get state for 0
 * state[0] & (1 << 1) -- will get state for 1
 * state[0] & (1 << 2) -- will get state for 2
 * state[0] & (1 << n) -- will get state for n
 *
 *
 *
 * We want to sieve the range from [0,sqrt(diff)]
 * we will stick with [0,diff] for now
 *
 */


#include <stdlib.h>
#include <stdio.h>

#define MIN 2
#define MAX 70
#define SIZE 1

void print(int* a);

int main(){
    //int* state = calloc(SIZE, 4);
    int* state = malloc(sizeof(int)*SIZE);
    int i, j, counter;

    for(i = 0; i < SIZE; ++i) state[i] = 0; 
        
    i = 2;
    while(i<=MAX){
        //printf("ADEBUG i: %d \n", i);
        if(!(state[0] & (1 << i))){
            j = i*i;
        //printf("DEBUG i: %d \n", i);
        //printf("DEBUG j: %d \n", j);
            counter = 0;
            while(j <= MAX+1){
                state[0] = state[0] | (1 << (j));
                counter++;
                printf("number I am on: %d \n", i);
                printf("multiple: %d \n", counter);
                printf("number to toggle (j): %d \n", j);
                j = (i*i) + counter*i;
            }   
        }
        printf("STEP: %d RESULT--------------------------- \n", i);
        print(state);
        i++;
    }
    printf("\n\n\n\n\n\n");
    printf("RESULT--------------------------- \n");
    print(state);
}

void print(int* a){
    int i;
    i = 2;
    do{
        if(!(a[0] & (1 << i))){
            printf("The state of %d is: \t true \n", i);
        }else{
            printf("The state of %d is: \t false \n", i);
        }   
        i++;
    }while(i <= MAX);
}

I understand there are some differences between malloc() and calloc(), so I tried using both:

int* state = malloc(sizeof(int)*SIZE);
for(i = 0; i < SIZE; ++i) state[i] = 0; 

and

int* state = calloc(SIZE, 4);

I believe my issue is either that I cannot address a continuous block of memory using only bit shift operations, or that I am using gcc and it might use a little-endian bit representation. I would really like to avoid division and modulo when computing the index for state[index]. However, I can try that first and then replace it with shift operations, since I believe I will only be dividing by numbers of the form 2^n.

Upvotes: 1

Views: 136

Answers (2)

Matt Timmermans

Reputation: 59174

If you're going to be manipulating individual bits, then you should be comfortable using bitwise operations to shift and mask instead of using / and %.

As @tstanisl indicates, you should use a fixed-size data type instead of int, and it's a little clearer to use an unsigned type if you aren't treating the sign bit as a sign.

For 32-bit words, for example:

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static inline void setBit(uint32_t *data, size_t bitIndex)
{
    data[bitIndex>>5] |= ((uint32_t)1) << (bitIndex&31);
}

static inline void clearBit(uint32_t *data, size_t bitIndex)
{
    data[bitIndex>>5] &= ~(((uint32_t)1) << (bitIndex&31));
}

static inline bool getBit(uint32_t *data, size_t bitIndex)
{
    return (bool)((data[bitIndex>>5] >> (bitIndex&31)) & 1);
}
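
As a rough sketch of how helpers like these could replace the `state[0] & (1 << i)` expressions in the question's sieve, the block below repeats `setBit` and `getBit` so it compiles on its own and adds a hypothetical `sieveComposites` function (the name, the `const` qualifier on `getBit`, and the sizing rule `(max >> 5) + 1` are assumptions, not part of the answer above). It keeps the question's convention that a set bit means "not prime".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

static inline void setBit(uint32_t *data, size_t bitIndex)
{
    data[bitIndex >> 5] |= ((uint32_t)1) << (bitIndex & 31);
}

static inline bool getBit(const uint32_t *data, size_t bitIndex)
{
    return (bool)((data[bitIndex >> 5] >> (bitIndex & 31)) & 1);
}

/* Hypothetical helper (not from the answer): builds a bit array covering
 * 0..max in which a set bit means "not prime". Returns NULL if the
 * allocation fails; the caller frees the result. */
static uint32_t *sieveComposites(size_t max)
{
    assert(max <= SIZE_MAX / 2);      /* keeps j += i below from wrapping */

    size_t words = (max >> 5) + 1;    /* enough 32-bit words for bits 0..max */
    uint32_t *state = calloc(words, sizeof *state);
    if (state == NULL)
        return NULL;

    setBit(state, 0);                 /* 0 is not prime */
    if (max >= 1)
        setBit(state, 1);             /* 1 is not prime */

    /* i <= max / i is the overflow-free form of i * i <= max */
    for (size_t i = 2; i <= max / i; ++i) {
        if (getBit(state, i))
            continue;                 /* i is composite, its multiples are done */
        for (size_t j = i * i; j <= max; j += i)
            setBit(state, j);         /* mark every multiple of i from i*i up */
    }
    return state;
}
```

Calling `sieveComposites(70)` and then testing `!getBit(state, n)` for each `n` would cover the question's `MAX` of 70 with three 32-bit words instead of one `int`.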

Upvotes: 1

tstanisl

Reputation: 14107

Do not be afraid of modulo operations with constant operands (like x % 8). Any modern compiler will optimize them into a bitwise AND with a mask. See https://godbolt.org/z/ehjMbo
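
To make the equivalence concrete, here is a small sketch that checks, for unsigned values, that `x % 8u` and `x / 8u` give the same results as the hand-written `x & 7u` and `x >> 3`. All function names are hypothetical and chosen only for this illustration. It says nothing about the generated machine code, which the Godbolt link above shows.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers: the readable and the hand-optimized spelling of
 * the same split of a bit number into a byte index and a bit offset. */
static inline size_t bitOffsetMod(size_t x)   { return x % 8u; }
static inline size_t bitOffsetMask(size_t x)  { return x & 7u; }
static inline size_t byteIndexDiv(size_t x)   { return x / 8u; }
static inline size_t byteIndexShift(size_t x) { return x >> 3; }

/* Asserts that both spellings agree for every x in [0, limit). */
static void checkSpellingsAgree(size_t limit)
{
    for (size_t x = 0; x < limit; ++x) {
        assert(bitOffsetMod(x) == bitOffsetMask(x));
        assert(byteIndexDiv(x) == byteIndexShift(x));
    }
}
```

Since the results are identical for unsigned operands, the `/` and `%` spelling can be chosen for readability.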

I suggest using a wide fixed-size unsigned integer type such as uint64_t from stdint.h. Use uint32_t if you are targeting a 32-bit machine.

Do not use int: its size is implementation-specific, and with a 32-bit int, shifts such as 1 << 32 or even 1 << 31 are undefined behavior.

#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

int limit = ...;
uint64_t *data = calloc((limit + 63) / 64, sizeof *data);

// reading bit X
bool bitX = (data[X / 64u] >> (X % 64u)) & 1;

// setting bit Y
data[Y / 64u] |= (1ull << (Y % 64u));

The u suffix in 64u forces X to be converted to an unsigned type, which lets the compiler use the optimal division algorithm.
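
Putting the read and set expressions together, a minimal sketch of a complete sieve might look like the following. The function name `countPrimes`, its `-1` error return, and the allocation size `limit / 64u + 1` (which covers bit `limit` itself, unlike `(limit + 63) / 64`) are assumptions made for this example. A set bit means "composite".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: counts the primes in [0, limit] using a bit array
 * of 64-bit words. Returns -1 if the allocation fails. */
static long countPrimes(size_t limit)
{
    assert(limit <= SIZE_MAX / 2);    /* keeps m += n below from wrapping */

    uint64_t *data = calloc(limit / 64u + 1, sizeof *data);
    if (data == NULL)
        return -1;

    long count = 0;
    for (size_t n = 2; n <= limit; ++n) {
        /* reading bit n */
        bool composite = (data[n / 64u] >> (n % 64u)) & 1;
        if (composite)
            continue;
        ++count;                      /* n was never marked, so it is prime */

        if (n > limit / n)
            continue;                 /* n * n > limit: nothing left to mark */
        for (size_t m = n * n; m <= limit; m += n)
            data[m / 64u] |= (1ull << (m % 64u));   /* setting bit m */
    }

    free(data);
    return count;
}
```

For the question's range, `countPrimes(70)` returns 19, since there are 19 primes from 2 through 67.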

Upvotes: 3
