Reputation: 217
I've written the following C++ code to factorize really large numbers efficiently (numbers up to 24997300729). I have a vector containing 41000 primes approx.( I know having such a large vector isn't a good idea although but couldn't figure a way around this). This code produces the prime factorization of moderately large numbers in no time but when it comes to numbers such as, 24997300572 the program stalls.
Here's the program below with some screenshots of the output:
#include <iostream>
#include <vector>
#include <algorithm>
#include <iterator>
#include <cmath>
using namespace std;
vector<int> primes = {paste from
https://drive.google.com/file/d/1nGvtMMQSa9YIDkMW2jgEbJk67P7p54ft/view?usp=sharing
};
void factorize(int n) {
if (n == 1)
return;
if (find(primes.begin(), primes.end(), n) != primes.end()) {
cout << n <<" "; //if n is prime dont'proceed further
return;
}
//obtaining an iterator to the location of prime equal to or just greater than sqrt(n)
auto s = sqrt(n);
vector<int>::iterator it = lower_bound(primes.begin(), primes.end(), s);
if (it == primes.end()) {
return; // if no primes found then the factors are beyond range
}
for (auto i = it;i != primes.begin();i--) {
if (n % *i == 0)
{
cout << *i << " ";
n = n / (*i);
factorize(n);
return; // the two consecutive for() loops should never run one after another
}
}
for (auto i = it;i != primes.end();i++) {
if (n % *i == 0)
{
cout << *i << " ";
n = n / (*i);
factorize(n);
return; // the two consecutive for() loops should never run one after another
}
}
}
int main() {
unsigned int n;
cout << "Enter a number between 1 and 24997300729 ";
cin >> n;
if (n > 24997300729) {
cout << "Number out of range;";
exit(-1);
}
factorize(n);
return 0;
}
This is OK
But This is NOT!!!
I tried using long long int
and long double
wherever I could to over come the problem of large numbers, but that didn't help much.
Any help Would Be Greatly Appreciated
Upvotes: 3
Views: 2327
Reputation: 490338
It's a little unclear (at least to me) exactly why you've structured the program the way you have.
You can fully factor a number by only looking for prime factors less than or equal to that number's square root. Any prime factor larger than those pairs with one prime factors smaller than that, so you only have to search for those to find all the prime factors. Any remaining factors can be obtained by simple division, not searching.
I'd probably generate the base of prime numbers on the fly (mostly likely using a sieve). The square root of 24'997'300'729 is (about) 158'105. A quick test shows that even without any work on optimization, a sieve of Eratosthenes will find the primes up to that limit in about 12 milliseconds.
Personally, I'd rather not have a fixed limit on the largest number the user can factor, other than the limit on the size of number we're working with, so if the user enters something close to the limit for a 64-bit number, we find all the primes that fit in 32 bits, and then use those to factor the number. This will obviously be slower than if we don't find as many primes, but a user probably won't be too surprised at the idea that factoring a larger number takes longer than factoring a smaller number.
So, implementing that, we might end up with code something like this:
#include <iostream>
#include <locale>
#include <vector>
#include <string>
using Number = unsigned long long;
auto build_base(Number limit) {
std::vector<bool> sieve(limit / 2, true);
for (Number i = 3; i < limit; i += 2) {
if (sieve[i / 2]) {
for (Number temp = i * i; temp < limit; temp += i)
if (temp & 1)
sieve[temp / 2] = false;
}
}
return sieve;
}
void factor(Number input, std::vector<bool> const &candidates)
{
while (input % 2 == 0) {
std::cout << 2 << "\t";
input /= 2;
}
for (Number i = 1; i < candidates.size(); i++) {
if (candidates[i]) {
auto candidate = i * 2 + 1;
while ((input % candidate) == 0) {
std::cout << candidate << "\t";
input /= candidate;
}
}
}
if (input != 1)
std::cout << input;
}
int main(int argc, char **argv) {
std::cout.imbue(std::locale(""));
if (argc != 2) {
std::cerr << "Usage: factor <number>\n";
return EXIT_FAILURE;
}
auto number = std::stoull(argv[1]);
auto limit = std::sqrt(number) + 1;
auto candidates = build_base(limit);
factor(number, candidates);
}
At a high level, the code works like this: we start by finding the primes up to the square root of the number the user entered. Since we want all the primes up to a limit, we use a sieve of Eratosthenes to find them. This builds a vector of bools, in which vector[n]
will be true if n
is prime, and false if n
is composite. It does this starting from 3 (2 is a special case we kind of ignore for now) and crossing off the multiples of three. Then it finds the next number that hasn't been crossed off (which will be five, in this case), and crosses off its multiples. It continues doing that until it reaches the end of the array. To save some space, it leaves all the even numbers out of the array, because (other than that special case for 2) we already know none of them is prime.
Once we have that, we use those prime numbers to find prime factors of the number we want to factor. This proceeds pretty simply: walk through the vector of primes, and test whether each prime number divides evenly into the target number. If it does, print it out, divide it out of the target number, and continue.
At least for me, this seems to work pretty dependably, and is reasonably fast. If we wanted to do a better job of factoring larger numbers, the next big step would be to switch to a segmented sieve. This can improve the speed of the first part of the job by a pretty wide margin, allowing us (for example) to factor anything that'll fit into a 64-bit number in no more than about 10 seconds.
Upvotes: 3