Reputation: 31

Optimizing a Prime Number Factorization algorithm

Below is an algorithm that finds the prime factorization of a given number N. I'm wondering whether there is any way to make it faster for HUGE numbers, meaning numbers with 20 to 35 digits. I want these to run as fast as possible. Any ideas?

import time

def prime_factors(n):
    """Returns all the prime factors of a positive integer"""
    factors = []
    divisor = 2
    while n > 1:
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor  # integer division keeps n an exact int
        divisor = divisor + 1
        if divisor*divisor > n:
            if n > 1:
                factors.append(n)  # whatever is left over is prime
            break
    return factors

#HUGE NUMBERS GO IN HERE!
start_time = time.time()
my_factors = prime_factors(15227063669158801)
end_time = time.time()
print(my_factors)
print("It took %s seconds." % (end_time - start_time))

Upvotes: 1

Answers (4)

Nuclearman

Reputation: 5314

No optimization of that algorithm will let you factor 35-digit numbers, at least in the general case. Trial division may have to try divisors up to sqrt(n), which is on the order of 10^17 for a 35-digit number, and there are far too many primes below that bound to list in a reasonable amount of time, let alone to divide by each one. Even if you were inclined to try, storing them would take far too much memory. In this case you'll want to pick a different algorithm from the list of general-purpose factorization algorithms.

However, if all the prime factors are small enough (say below 10^12 or so), you could generate the primes with a segmented Sieve of Eratosthenes, or simply find a list of primes up to some practical bound online and use that instead of computing them yourself, hoping the list is large enough.
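A minimal sketch of the segmented-sieve idea, assuming Python 3.8+ (for `math.isqrt`). The names `small_primes`, `segmented_primes`, `factor_with_small_primes`, `segment_size` and `prime_limit` are made up for illustration, and the default bound of 10^6 is far below 10^12 just so it runs quickly. Anything that cannot be split using primes up to `prime_limit` is returned as an unfactored remainder.

```python
from math import isqrt

def small_primes(limit):
    """Plain Sieve of Eratosthenes: list of all primes <= limit."""
    if limit < 2:
        return []
    flags = bytearray([1]) * (limit + 1)
    flags[0] = flags[1] = 0
    for p in range(2, isqrt(limit) + 1):
        if flags[p]:
            # cross out p*p, p*p + p, ... in one slice assignment
            flags[p * p::p] = bytearray(len(range(p * p, limit + 1, p)))
    return [i for i, flag in enumerate(flags) if flag]

def segmented_primes(limit, segment_size=1 << 16):
    """Yield the primes <= limit, sieving one fixed-size window at a time."""
    root = isqrt(limit)
    base = small_primes(root)          # primes needed to sieve every window
    for p in base:
        yield p
    low = root + 1
    while low <= limit:
        high = min(low + segment_size - 1, limit)
        flags = bytearray([1]) * (high - low + 1)
        for p in base:
            if p * p > high:
                break
            # first multiple of p inside [low, high], but never below p*p
            start = max(p * p, ((low + p - 1) // p) * p)
            flags[start - low::p] = bytearray(len(range(start, high + 1, p)))
        for i, flag in enumerate(flags):
            if flag:
                yield low + i
        low = high + 1

def factor_with_small_primes(n, prime_limit=10**6):
    """Return (prime_factors, remainder); remainder is 1 when fully factored."""
    factors = []
    for p in segmented_primes(prime_limit):
        if p * p > n:
            break
        while n % p == 0:
            factors.append(p)
            n //= p
    if 1 < n <= prime_limit * prime_limit:
        # no prime factor <= sqrt(n) is left, so n itself must be prime
        factors.append(n)
        n = 1
    return factors, n
```

If the remainder comes back greater than 1, all of its prime factors exceed `prime_limit`, and a different algorithm is needed to split it further.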

Upvotes: 0

user448810

Reputation: 17866

Your algorithm is trial division, which has time complexity O(sqrt(n)). You can improve your algorithm by using only 2 and the odd numbers as trial divisors, or even better by using only prime numbers as trial divisors, but the time complexity will remain O(sqrt(n)).
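As a rough sketch of the first suggestion (2 and then odd trial divisors only), something like the following should work. The function name `prime_factors_odd` is mine, not from the question, and it assumes Python 3 and a positive integer argument.

```python
def prime_factors_odd(n):
    """Trial division that tries 2 once and then only odd divisors."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    factors = []
    # strip every factor of 2 first
    while n % 2 == 0:
        factors.append(2)
        n //= 2
    divisor = 3
    # only odd candidates up to sqrt(n) remain to be tried
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 2
    if n > 1:
        # the leftover has no divisor <= its square root, so it is prime
        factors.append(n)
    return factors
```

This roughly halves the number of divisions, but the worst case still grows like sqrt(n).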

To go faster you need a better algorithm. Try this:

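from math import gcd

def factor(n, c):
    # Pollard's rho with the polynomial x*x + c (mod n)
    f = lambda x: (x*x + c) % n
    t, h, d = 2, 2, 1
    while d == 1:
        # the tortoise takes one step, the hare takes two
        t = f(t); h = f(f(h)); d = gcd(t-h, n)
    if d == n:
        # the cycle closed without exposing a factor; retry with a new constant
        return factor(n, c+1)
    return d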

To call it on your number, say

print(factor(15227063669158801, 1))

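That returns the (possibly composite) factor 2090327 virtually instantly. It uses an algorithm called the rho algorithm, invented by John Pollard in 1975. The rho algorithm has an expected (heuristic) time complexity of O(sqrt(sqrt(n))), so it's much faster than trial division. Note that it only finds one factor and does not terminate if n is prime, so you need to check primality first and repeat on the cofactors to get the full factorization.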
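To turn a single rho call into a complete prime factorization, one option is to pair it with a primality test and keep splitting the composite pieces. The sketch below is only one way to do that, assuming Python 3. The names `is_probable_prime`, `rho_factor` and `factorize` are hypothetical. The Miller-Rabin test uses the first twelve primes as fixed bases, which as far as I know is exact for n below roughly 3 × 10^23; for larger inputs treat it as a probable-prime test.

```python
from math import gcd

SMALL_PRIMES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)

def is_probable_prime(n):
    """Miller-Rabin with fixed bases (probabilistic for very large n)."""
    if n < 2:
        return False
    for p in SMALL_PRIMES:
        if n % p == 0:
            return n == p
    # write n - 1 as d * 2**s with d odd
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in SMALL_PRIMES:
        x = pow(a, d, n)
        if x == 1 or x == n - 1:
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False      # a is a witness that n is composite
    return True

def rho_factor(n, c=1):
    """Pollard rho (Floyd cycle finding) for a composite n.
    Returns a nontrivial, possibly composite, factor."""
    while True:
        t = h = 2
        d = 1
        while d == 1:
            t = (t * t + c) % n          # tortoise: one step
            h = (h * h + c) % n          # hare: two steps
            h = (h * h + c) % n
            d = gcd(t - h, n)
        if d != n:
            return d
        c += 1                           # failed; retry with another constant

def factorize(n):
    """Sorted list of prime factors of a positive integer n."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    factors = []
    for p in SMALL_PRIMES:               # cheap trial division first
        while n % p == 0:
            factors.append(p)
            n //= p
    stack = [n]
    while stack:
        m = stack.pop()
        if m == 1:
            continue
        if is_probable_prime(m):
            factors.append(m)
        else:
            d = rho_factor(m)
            stack.extend((d, m // d))    # split further until all are prime
    return sorted(factors)
```

Calling `factorize(15227063669158801)` should return the full list of prime factors rather than a single divisor. The primality check matters because rho would loop through every constant without success if handed a prime.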

There are many other algorithms for factoring integers. For numbers in the 20 to 35 digit range that interests you, the elliptic curve algorithm is well-suited. It should factor numbers of that size in no more than a few seconds. Another algorithm that is well-suited to such numbers, especially those that are semi-primes (have exactly two prime factors), is SQUFOF.

If you're interested in programming with prime numbers, I modestly recommend this essay on my blog. When you're finished with that, other entries on my blog talk about elliptic curve factorization, and SQUFOF, and various other even more-powerful methods of factoring ever-larger integers.

Upvotes: 1

Lorientas

Reputation: 66

It seems like there is no primality check on the divisors. Sorry if I am wrong, but how do you know whether divisor is prime or not? Your divisor variable increases by 1 on each pass through the loop, so I assume it will go through a lot of composite numbers.

Upvotes: 0

AechoLiu

Reputation: 18428

For example, to find the prime factorization of 100:

Check whether 2 is a factor. Then every multiple 2*c with 2 < 2*c <= 100 can be removed from the candidate set: 4, 6, 8, ..., 100.
Check whether 3 is a factor. Then every multiple 3*d with 3 < 3*d <= 100 can be removed: 6, 9, 12, ..., 99.
4 has already been removed from the candidate set, so skip it.
Check 5, then remove 10, 15, 20, ..., 100.
6 has already been removed.
Check 7, and so on.
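The crossing-out steps above are essentially the Sieve of Eratosthenes. A small sketch of how that might look in Python 3 follows; the function names `primes_up_to` and `factor_by_sieve` are only illustrative, and sieving all the way up to n is practical for small n only.

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes: cross out 2*p, 3*p, ... for each surviving p."""
    if limit < 2:
        return []
    is_candidate = [True] * (limit + 1)
    is_candidate[0] = is_candidate[1] = False
    for p in range(2, limit + 1):
        if is_candidate[p]:
            # p survived, so it is prime; remove its larger multiples
            for multiple in range(2 * p, limit + 1, p):
                is_candidate[multiple] = False
    return [p for p in range(limit + 1) if is_candidate[p]]

def factor_by_sieve(n):
    """Divide n by each surviving candidate (i.e. each prime) in turn."""
    factors = []
    for p in primes_up_to(n):
        while n % p == 0:
            factors.append(p)
            n //= p
    return factors

print(factor_by_sieve(100))  # [2, 2, 5, 5]
```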

Upvotes: 0
