Reputation: 605
(This is Project Euler Problem 14)
I'm trying to diversify my Python skills by using Itertools.
I want to get the index of a maximum returned value in a set number of iterations (1,000,000) from a generator in Python's itertools.
The function works, butt I can't figure out how to get to effectively get the maximum value without storing it in a massive one-million long list which I think could hardly be efficient to this. Is there an intelligent smart way? (edit: Maybe this is the easiest way?)
The current code is that at the moment it stops when the chain reaches 100 which is wrong.
#Project Euler Problem 14
#Which starting number, under one million, produces the longest chain?
import itertools
def length():
x=1
while True:
l=0
c=x #Create the running integer.
while c>1:
if c%2==0:
#Number is even, divide it by two.
c=c/2
l+=1
else:
#Number is odd, multiply by three and add one.
c=3*c+1
l+=1
yield l+1 #Add the final one as well and show the chain length.
x+=1 #Increment
print list(itertools.takewhile(lambda x:x<100,length()))
Upvotes: 1
Views: 416
Reputation: 877
I can't see a reason to use itertools.takewhile here. Takewhile allows for early termination, but you can't terminate early since you must check collatz(n)
for each n
under one million.
The standard max function will find the highest value in an iterable. A range object is an iterable that works in constant memory (ie., it doesn't use an internal list of all the values in the range.). I normally wouldn't post a solution publicly, but since this is only one of the first twenty problems, and there are already three distinct solutions posted here...
def collatz(n):
return n // 2 if n%2 == 0 else 3*n + 1
def distance(n, cache={1:1}):
if n not in cache: cache[n] = distance(collatz(n)) + 1
return cache[n]
ans = max(range(1,1000000), key=distance)
print(ans)
You really should just use the standard max function. You don't know how long the longest chain will be, so taking the first chain longer than 100 is not guaranteed to work. It's just guess-and-check. Max finds the longest without storing everything in memory.
I should leave a disclaimer that I memoize 'distance'. That means that my code stores at least 1,000,000 values so that collatz(n) doesn't need to be re-calculated repeatedly. This costs a few MB of ram, but cuts execution from over a full minute to just a few seconds. It's very much worth it.
Upvotes: 1
Reputation: 122116
To add a second answer with a different approach; this one will minimise the memory footprint using a generator to produce the collatz
sequence.
However, it is much slower. Running this takes ~60s, compared to ~5s for the first run of the memoize
d version (and ~1s for subsequent runs, once the cache
has all the values in),
def collatz(n):
while n > 1:
yield n
n = (n * 3) + 1 if n % 2 else n // 2
yield 1
def euler_014_iter(max_=1000000):
longest = max_len = None
for n in range(1, max_):
length = sum(1 for _ in collatz(n))
if max_len is None or length > max_len:
longest, max_len = n, length
return longest
Upvotes: 1
Reputation: 122116
I approached the problem as follows:
import functools
def euler_014(max_=1000000):
longest = max_len = None
for n in range(1, max_):
length = len_collatz(n)
if max_len is None or length > max_len:
max_len, longest = length, n
return longest
def memoize(f):
cache = {}
@functools.wraps(f)
def func(*args):
if args not in cache:
cache[args] = f(*args)
return cache[args]
return func
@memoize
def len_collatz(n):
if n == 1:
return 1
if n % 2:
n = (n * 3) + 1
else:
n //= 2
return 1 + len_collatz(n)
Here memoize
makes things more efficient by storing previous results for len_collatz
, len_collatz
tells you the length of the sequence from n
to 1
(without actually generating a list of the whole sequence) and euler_014
simply has to keep track of the longest length and the associated value of n
. Note that it doesn't keep all of the results - just the length of the longest series so far and the value of n
that produced that sequence.
Upvotes: 1