Danial
Danial

Reputation: 622

sieve upto 2 billion gives segmentation fault

I am using this program to check a number if prime or not.

Use algorithm - Sieve :

#include<bits/stdc++.h>
//#define _max    2000000001
#define _max    20000001
using namespace std;
bool sieve[_max];
void init()
{
    memset(sieve,true,sizeof(sieve));
    sieve[0]=sieve[1]=false;
    for(int i=2;i<_max;i+=2)
    {
        sieve[i]=false;
    }
}
void go_sieve(int n)
{
    n++;
    for(int i=3;i<n;i+=2)
    {
        if(sieve[i]==false)
            continue;
        for(int j=2*i;j<n;j+=i)
            sieve[j]=false;
    }
}
void print(int n)
{
    n++;
    printf("-------------\n");
    for(int i=0;i<n;i++)
    {
        if(sieve[i])
            cout << i << " ";
    }
    printf("\n-------------\n");
}
int main()
{
    init();
    int n;
    scanf("%d",&n);
    while(n--)
    {
        int x;
        scanf("%d",&x);
        go_sieve(x);
        //print(x);
        if(sieve[x])
            printf("Prime\n");
        else
            printf("Not prime\n");
    }
    return 0;
}

Now it works upto 2e7 and pretty smoothly, but I want to check upto 2e9, if I change my _max to 2000000001 it gives me segmentation error and exits with an error code.

How can I resolve this problem ?

I have tried a new approach with set :

#include<bits/stdc++.h>
//#define _max    200001
//#define _max    20000001
#define _max    2000000001
using namespace std;
set<int>prime;
set<int>nprime;
void init()
{
    prime.insert(2);
}
void go_sieve()
{
    for(int i=3;i<_max;i+=2)
    {
        if(prime.find(i)==prime.end() && nprime.find(i)==nprime.end())
        {
            prime.insert(i);
            //cout << i << endl;
            for(int j=2*i;j<_max;j+=i)
                nprime.insert(j);
        }
        if(nprime.find(i)!=nprime.end())
            nprime.erase(nprime.find(i));
    }
}
void print()
{
    set<int> ::iterator itt;
    printf("-------------\n");
    for(itt=prime.begin();itt!=prime.end();itt++)
    {
        cout << *itt << " ";
    }
    printf("\n-------------\n");
}
int main()
{
    init();
    go_sieve();
    //print();
    int n;
    scanf("%d",&n);
    while(n--)
    {
        int x;
        scanf("%d",&x);
        if(prime.find(x)!=prime.end())
            printf("Prime\n");
        else
            printf("Not prime\n");
    }
    return 0;
}

Target is to execute it within 512MB~1GB memory.

Upvotes: 1

Views: 204

Answers (3)

user448810
user448810

Reputation: 17866

If you want to enumerate large ranges of prime numbers, you should use a segmented Sieve of Eratosthenes; it will be faster (due to caching effects) and use less memory.

If you only want to determine if one number is prime, or a few numbers, sieving is a horrible way to do it. Sieving should only be used when you are interested in an entire range of numbers. For n up to a billion, trial division is simple and probably fast enough. For larger numbers, a Miller-Rabin test or Baillie-Wagstaff test is probably better.

Upvotes: 1

VLL
VLL

Reputation: 10165

You probably hit some memory limit on your system which causes the segmentation fault.

However, you don't need such a big array. Using Sieve of Eratosthenes, you need to calculate numbers up to x. Instead of an array you can use std::vector and increase its size as you calculate more numbers. This should allow you to calculate some numbers, but with large numbers you will hit the memory limit again.

You could also use some algorithm which requires you to store fewer numbers. To determine whether x is prime, you only need to compare against prime numbers that are smaller than the square root of x. You don't have to store numbers that are not primes. With x = 1e10, you would only need to store 5e8 numbers.

Here is some example with vector (probably not optimal):

#include <iostream>
#include <vector>
#include <algorithm>
#include <cmath>

std::vector<int> primes = {2};

void calculate(int x) {
    const int largest_prime = primes.back();
    if (largest_prime >= x) {
        // Already calculated
        return; 
    }
    for (size_t i = largest_prime + 1; i <= x; i++) {
        bool not_prime = false;
        for (size_t j = 0; j < primes.size(); j++) {
            if (i % primes[j] == 0) {
                not_prime = true;
                break;
            }
        }
        if (!not_prime) {
            primes.push_back(i);
        }
    }
}

bool check(int x) {
    calculate(x);
    return std::find(primes.begin(), primes.end(), x) != primes.end();
}

int main() {
    std::cout << check(15) << std::endl;
    std::cout << check(256699) << std::endl;
}

Upvotes: 0

4386427
4386427

Reputation: 44329

I can't reproduce this on my system. My guess is that this has to do with a system dependant limitation.

You declare sieve as a global array (static storage duration) and it's huge (i.e. 2000000001 * sizeof(bool) - could be 2-8G depending on sizeof bool). Maybe your system can't handle that.

Instead of a global array, try using dynamic allocation:

// bool sieve[_max]; comment out this
bool* sieve = NULL;

...
...

int main()
{
    sieve = (bool*)malloc(_max * sizeof *sieve);
    if (sieve == NULL)
    {
        // out of memory
        exit(1);
    }

    ...

That said:

Your code is C++ but your style is more C like.

In C++ you would probably use a std::vector instead. That would make everything much easier.

BTW: Also avoid globals. Instead define the vector (or dynamic array) in main and pass it by-reference to the functions.

Upvotes: 0

Related Questions