Reputation: 593
Here's my code:
def factorize(n):
sieve = [True] * (n + 1)
for x in range(2, int(len(sieve) ** 0.5) + 1):
if sieve[x]:
for i in range(x + x, len(sieve), x):
sieve[i] = False
lowerPrimes = i for i in range(2, len(sieve)) if sieve[i]] and (n % i == 0)]
return lowerPrimes
factorize(n)
returns all prime factors of the given value n
. As you can see, it first makes an Eratosthenes sieve for n
and then uses a list comprehension to return all values in the sieve that are factors of n
. It works relatively fine for this purpose, however, I want it to return a list so that if you multiply every item in it, the result is n
. Do you get my point?
For example, factorize(99020)
returns [2, 5, 4951]
, but I'd like it to return [2, 2, 5, 4951]
, as 2*2*5*4951 = 99020
.
I know my approach is not even close, but could you help me to make it so?
Upvotes: 9
Views: 31851
Reputation: 11
Here is my proposition: a python iterator generating all primes and memoizing them, mutally recursive with a primality test.
class Primes:
memo = [2,3] # need 3 to be able to do += 2 below
def __iter__(self):
self.index = 0
return(self)
def __next__(self):
if self.index < len(Primes.memo):
r = Primes.memo[self.index]
self.index += 1
return r
else:
x = Primes.memo[self.index - 1] + 2
while not (is_prime(x)):
x += 2
Primes.memo.append(x)
self.index += 1
return x
def is_prime(n):
primes = iter(Primes())
a = next(primes)
while (a * a <= n):
if n % a == 0: return False
a = next(primes)
return True
def factorise(n):
primes = iter(Primes())
r = []
a = next(primes)
while (a * a <= n):
if n % a == 0: r.append(a); n = n // a
else: a = next(primes)
r.append(n)
return(r)
def next_prime(n):
if n%2 == 0: n += 1
while not(is_prime(n)): n += 2
return(n)
print(next_prime(1000000000))
Upvotes: 0
Reputation: 203
I developed a more versatile function that also answers this question.
This is not so simple as @Evg's single line lambda method (awesome!), but returns list, dictionary or plain text! However, this function is the slowest method here: @Evg is the fastest, followed close by @Arty. I suggest replacing the list generation step with the @Evg's lambda method.
def factorof(x,mode='p'):
A=[]
for y in range(2,x):
while x % y == 0:
x = x / y
A.append(y)
# prime numbers
if sum(A) == 0:
A.append(x)
mode = str(mode)[0].lower()
if mode in [ 'l', '1' ]:
return A
else:
B = {}
for i in A:
if i in B:
B[i] = B[i] + 1
else:
B[i] = 1
if mode in [ 'd', '2' ]:
return B
else: # mode == 'anything else'; is the default
C = { 0 : "\U00002070", 1 : "\U000000B9", 2 : "\U000000B2",
3 : "\U000000B3", 4 : "\U00002074", 5 : "\U00002075",
6 : "\U00002076", 7 : "\U00002077", 8 : "\U00002078",
9 : "\U00002079" }
D = ''
for i in B:
exp = ''
if B[i] > 1:
for num in str(B[i]):
if int(num) in C:
exp = exp + str(C[int(num)])
D = D + str(i) + exp + " \U000000D7 "
return D[:-3]
Usage:
factorof(99020) # default: plain text
'2² × 5 × 4951'
factorof(99020,1) # can replace 1 by 'list', 'ls' or 'l'
[2, 2, 5, 4951]
factorof(99020,2) # can replace 2 by 'dict' or 'd'
{2: 2, 5: 1, 4951: 1}
Upvotes: 0
Reputation: 71065
The code in the answer by Blender is very nice, but that algorithm is lacking in one very important respect: it tests way too much. E.g. trying to factorize n=392798360393
, which is a prime number, it will try to divide it by all the numbers below it (including itself). This will take a lot of time.
Is it really necessary? If n == a*b
and a < b
, having found a
we don't really need to test divide n
by b
. We know that it too divides n
, because n/a == b
implies n/b == a
. So we only need to test while the potential factor a
is smaller than (or equal to) the potential factor b
. That is to say, until the square root of the number is reached.
Also, instead of starting over from 2 for each reduced n
it's OK to start from the previous value of i
:
from math import sqrt
def factors(n): # (cf. https://stackoverflow.com/a/15703327/849891)
j = 2
while n > 1:
for i in range(j, int(sqrt(n+0.05)) + 1):
if n % i == 0:
n //= i ; j = i
yield i
break
else:
if n > 1:
yield n; break
Indeed, factoring 9000009
by this code takes 0.08 seconds on Ideone, instead of 0.59 seconds.
This is guaranteed to produce only primes (because we divide out each factor found, and we try the candidates in non-decreasing order). The overhead of generating primes first and then testing only by the primes (a.o.t. testing by all numbers, as done here above) will be worth it if we are factoring several numbers at once; for factoring just one number it might not be worth it, depending on the speed of your prime generation.
But then, what really should be done when factoring several numbers at once, is to create the smallest factor sieve first where we mark each number in a given range by its smallest (prime) factor (from which it was generated by the sieve) instead of just by True
or False
as in the sieve of Eratosthenes. This smallest factor sieve is then used for efficient factorization of each of the numbers given, in the same range, by successive division by their factors, from the smallest up, which are efficiently found by a direct look-up in the sieve instead of testing anew:
def sfs_factorize(nums):
top = max(nums)
sv = smallest_factor_sieve(top)
for k in nums:
fs = [] ; n = k
while n > 1:
f = sv[n] # no testing
n //= f
fs.append(f)
print( k, list(fs))
Upvotes: 18
Reputation: 298106
The Sieve of Eratosthenes helps you find prime numbers below a certain limit. It's not really going to help you with finding the factors of a particular number.
If you want to do that, the simplest approach that I can see is something like this:
def factors(n):
while n > 1:
for i in range(2, n + 1):
if n % i == 0:
n //= i
yield i
break
for factor in factors(360):
print factor
This basically finds the smallest factor of n
(which is guaranteed to be prime), divides n
by that number and repeats the process until n
is equal to 1
.
The output is:
2
2
2
3
3
5
They multiply out to the original number:
>>> from operator import mul
>>> reduce(mul, factors(360))
360
Upvotes: 11
Reputation: 16737
Your first prime-sieving part is totally alright. Only second factors-finding part should be corrected.
In second part you should divide by prime factor not just single time, but multiple times till it is still divisible. This will ensure that you take into account powers of all prime factors.
This second part is called trial division by prime numbers. As it is well known it is enough to do trial division only by divisors below Sqrt(N)
. Remaining value of N will be automatically also prime.
One more very important speed improvement to your algorithm is that it is enough to make sieve size Sqrt(N)
, and not N
. Because we need only primes that are less than Sqrt(N)
. This is because after trial division remaining N
is automatically becomes prime.
I also found one more improvement for your sieving part, you should replace for i in range(x + x, ...
with for i in range(x * x, ...
. Because you can start sieving not from x + x
but from x * x
, according to classical algorithm of Sieve of Erathosthenes. This will further improve speed. Starting from x + x
is alright, but x * x
will also give correct results with better performance.
One more obvious improvement is when you want to factorize multiple numbers, then you can reuse and extend sieve multiple times, this will make code faster. When reusing sieve it is good to keep resulting prime numbers in a separate list, so that on second Trial Division stage you don't iterate whole sieve but only prime numbers, this again will give speedup. I didn't do these two improvements in code below, but if you want I can do, just tell.
Full corrected final version is:
def factorize(n):
sieve = [True] * int(n ** 0.5 + 2)
for x in range(2, int(len(sieve) ** 0.5 + 2)):
if not sieve[x]:
continue
for i in range(x * x, len(sieve), x):
sieve[i] = False
factors = []
for i in range(2, len(sieve)):
if i * i > n:
break
if not sieve[i]:
continue
while n % i == 0:
factors.append(i)
n //= i
if n > 1:
factors.append(n)
return factors
print(factorize(99020)) # [2, 2, 5, 4951]
Input:
99020
Output:
[2, 2, 5, 4951]
If you need to factorize just few numbers it might be even faster to do Trial Division Factorization without intermediate sieving stage. This also makes code more simple and shorter.
Full code for doing factorization without sieving stage is below.
In this code I did one obvious speed improvement by trying only odd divisors, this will double total algorithm speed. Of course for doing this you need to factor out all powers of 2 before, which I did too.
def factorize(n):
factors = []
while (n & 1) == 0:
factors.append(2)
n >>= 1
for d in range(3, 1 << 60, 2):
if d * d > n:
break
while n % d == 0:
factors.append(d)
n //= d
if n > 1:
factors.append(n)
return factors
print(factorize(99020)) # [2, 2, 5, 4951]
Upvotes: 2
Reputation: 3080
My one-liner with python 3.8 (assignment expressions used)
f = lambda n: (p:=[next(i for i in range(2, n+1) if n % i == 0)] if n>1 else [])+(f(n//p[0]) if p else [])
video with details - Factorizing number in Python
Upvotes: 1
Reputation: 2349
I'm not familiar enough to tell if this question should be deleted or whatever, but I'll help out anyhow.
You have the sieve part correct, I think.
The key is to use a while
loop to keep on dividing out a valid prime factor more than once.
factors = []
sieve[0] = sieve[1] = False # So I don't have to worry about trying to skip over these two
for testFactIndex, isPrime in enumerate(sieve):
if isPrime:
while n%testFactIndex == 0:
n = n/testFactIndex
factors.append(testFactIndex)
return factors
Upvotes: 1