Reputation: 69
I wrote two codes with almost same structure,
def prime_gen1(Limit = 10000):
List = [2,]
for x in range(3,Limit):
for y in List:
if not x%y:
break
if not x%y:
continue
else:
List.append(x)
yield x
def prime_gen2(Limit = 10000):
from math import floor
for x in range(3,Limit):
for y in range(2, floor(x**0.5)+2):
if not x%y:
break
if not x%y:
continue
else:
yield x
>>> list(prime_gen1(20000)) == list(prime_gen2(20000))
True
>>> def time1(number):
st = time()
list(prime_gen1(number))
end = time()
return end - st
>>> def time2(number):
st = time()
list(prime_gen2(number))
end = time()
return end - st
One does same work as other, but the latter actually works much faster. I'm wondering why this happens.
Logically - or nonlogically, I thought checking with primes will outdo the other way, in this case- checking by numers between 3 and root of number.But time-checking showed vice versa, checking with all the numers works much faster - about 5 times. Its performance increasingly differs,
>>> time1(200000)
8.185129404067993
>>> time2(200000)
0.4998643398284912
Second method is outdoing it. What makes this different?
Upvotes: 1
Views: 84
Reputation: 12157
Some better timings
%timeit list(prime_gen1(10**5))
2.77 s ± 204 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit list(prime_gen2(10**5))
219 ms ± 10.5 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
You take a number of optimisation steps in your second algorithm but you don't in your first: here are some problems with your first algorithm.
def prime_gen1(Limit = 10000):
List = [2,]
for x in range(3,Limit): # we're checking all even numbers too
for y in List: # only need to check up to sqrt(x)!!
if not x%y:
break
if not x%y: # why are we checking again? Use a for-else construct
continue
else:
List.append(x) # just return the list at the end
yield x # when wrapped in list just copies List
Here is an optimised version of the first algorithm (not a generator because a generator from a list is just pointless):
def memeff_primelist(n):
if n <= 2:
return []
primes = [] # Add 2 in at the end, don't need to check non-even
for i in range(3, int(n), 2):
for p in primes:
if i % p == 0: # non-prime
break
if p * p > i: # no factors bigger than sqrt(i)!!!
primes.append(i)
break
else:
primes.append(i) # only for i == 3
primes.insert(0, 2)
return primes
%timeit memeff_primelist(10**5)
88.9 ms ± 16.5 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
Upvotes: 1
Reputation: 2706
the list version does a lot more checks than the one going only to a square root of the number
for limit 200000 the square root is ~447 there are 17983 prime numbers smaller than 200000
just add a count of how many times you do the x%y check like
def prime_gen1(Limit = 10000):
List = [2,]
modulo_checks = 0
for x in range(3,Limit):
for y in List:
modulo_checks += 1
if not x%y:
break
if not x%y:
continue
else:
List.append(x)
yield x
print(modulo_checks)
def prime_gen2(Limit = 10000):
from math import floor
modulo_checks = 0
for x in range(3,Limit):
for y in range(2, floor(x**0.5)+2):
modulo_checks += 1
if not x%y:
break
if not x%y:
continue
else:
yield x
print(modulo_checks)
now for limit 200000 the version 1 does 162416226 checks and the second one 7185445
if you add an early break for looping of the list the list version is significantly faster(2times as fast 1799767 checks 0.24s vs 7185445 checks 0.64s)
...
sq_root = floor(x ** 0.5) + 2
for y in List:
modulo_checks += 1
if not x % y or y > sq_root:
break
...
and remove the math import if you want to compare algorithm speeds
Upvotes: 4