Reputation: 31
I'm working on a problem from stuy's coding problems and came across this one.
So given two generators that each output numbers in increasing order, merge the two generators into one generator that outputs the numbers in increasing order. If duplicates occur, output the number as many times as it occurs.
My attempt: Since I'm more familiar with working with lists, tuples, dictionaries, etc, I thought I'd just make a helper to create a list of items in the generators. Then I'd merge the two lists and sort them
def list_maker(gener):
l1 = []
for item in gener:
l1.append(item)
return l1
def merge_gens(first_gen, second_gen):
first_list = list_maker(first_gen)
second_list = list_maker(second_gen)
first_list.extend(second_list)
final_list = first_list
final_list.sort()
yield from final_list
Although this approach seems to work on finite generators, it does not on infinite generators(which I forgot to account for). I obviously can't have a list of infinite items. Could I get help on how to do this without importing python libraries?
Upvotes: 1
Views: 446
Reputation: 5757
You can try :
def merge(first, second):
a = next(first)
b = next(second)
while(True):
# yield the smaller one
yield a if a < b else b
# get the next number from the
# generator that yielded the smaller one
if a < b:
a = next(first)
elif a==b:
# when the numbers are equal
# yield second number a second time
yield a
# get the next numbers from both the generators.
a = next(first)
b = next(second)
else:
b = next(second)
Sorry for the lack of comments and explanation. I haven't tested edge cases. I hope you get the general gist of the approach and would help you get the pointers to work on your task further.
Assumption - StopIteration exceptions will be handled by the callee
Upvotes: 1
Reputation: 95948
This was a bit tricky to handle the edge cases, I had fun with this. Haven't tested it fully, and it's in a pretty verbose state right now, a couple helper functions could add clarity:
def merge(first, second):
first = iter(first)
second = iter(second)
exhausted = object()
f = next(first, exhausted)
if f is exhausted:
yield from second
s = next(second, exhausted)
if s is exhausted:
yield f
yield from first
return
while True:
if f is exhausted:
if s is not exhausted:
yield s
yield from second
return
elif s is exhausted:
if f is not exhausted:
yield f
yield from first
return
elif f < s:
yield f
f = next(first, exhausted)
elif f == s:
yield f
yield s
f = next(first, exhausted)
s = next(second, exhausted)
else:
yield s
s = next(second, exhausted)
I think the following makes it more readable by removing some of the deeper nesting and re-using logic:
def merge(first, second):
first = iter(first)
second = iter(second)
exhausted = object() # just a unique sentinel value
def _cleanup(item, iterator):
if item is not exhausted:
yield item
yield from iterator
f = next(first, exhausted)
if f is exhausted:
yield from second
s = next(second, exhausted)
if s is exhausted:
yield from _cleanup(f, first)
return
while True:
if f is exhausted:
yield from _cleanup(s, second)
return
elif s is exhausted:
yield from _cleanup(f, first)
return
elif f < s:
yield f
f = next(first, exhausted)
elif f == s:
yield f
yield s
f = next(first, exhausted)
s = next(second, exhausted)
else:
yield s
s = next(second, exhausted)
The key idea is to keep asking for a value from each of the iterators, yielding the smallest item (or if they are equal, yield both items), and only drawing from the iterator that gave you the smallest item (or from both if they are equal) until one iterator is exhausted then you clean it all up by delegating to the other.
Upvotes: 0