Diaelectics
Diaelectics

Reputation: 31

Order two sorted generators without imports

I'm working on a problem from stuy's coding problems and came across this one.

So given two generators that each output numbers in increasing order, merge the two generators into one generator that outputs the numbers in increasing order. If duplicates occur, output the number as many times as it occurs.

My attempt: Since I'm more familiar with working with lists, tuples, dictionaries, etc, I thought I'd just make a helper to create a list of items in the generators. Then I'd merge the two lists and sort them

def list_maker(gener):
    l1 = []
    for item in gener:
        l1.append(item)
    return l1

def merge_gens(first_gen, second_gen):
    first_list = list_maker(first_gen)
    second_list = list_maker(second_gen)

    first_list.extend(second_list)
    final_list = first_list
    final_list.sort()

    yield from final_list

Although this approach seems to work on finite generators, it does not on infinite generators(which I forgot to account for). I obviously can't have a list of infinite items. Could I get help on how to do this without importing python libraries?

Upvotes: 1

Views: 446

Answers (2)

abhilb
abhilb

Reputation: 5757

You can try :

def merge(first, second):
    a = next(first)
    b = next(second)
    while(True):
        # yield the smaller one
        yield a if a < b else b

        # get the next number from the
        # generator that yielded the smaller one
        if a < b:
            a = next(first)
        elif a==b:
            # when the numbers are equal
            # yield second number a second time
            yield a
            # get the next numbers from both the generators. 
            a = next(first)
            b = next(second)
        else:
            b = next(second)

Sorry for the lack of comments and explanation. I haven't tested edge cases. I hope you get the general gist of the approach and would help you get the pointers to work on your task further.

Assumption - StopIteration exceptions will be handled by the callee

Upvotes: 1

juanpa.arrivillaga
juanpa.arrivillaga

Reputation: 95948

This was a bit tricky to handle the edge cases, I had fun with this. Haven't tested it fully, and it's in a pretty verbose state right now, a couple helper functions could add clarity:

def merge(first, second):
    first = iter(first)
    second = iter(second)
    exhausted = object()
    f = next(first, exhausted)
    if f is exhausted:
        yield from second
    s = next(second, exhausted)
    if s is exhausted:
        yield f
        yield from first
        return
    while True:
        if f is exhausted:
            if s is not exhausted:
                yield s
            yield from second
            return
        elif s is exhausted:
            if f is not exhausted:
                yield f
            yield from first
            return
        elif f < s:
            yield f
            f = next(first, exhausted)
        elif f == s:
            yield f
            yield s
            f = next(first, exhausted)
            s = next(second, exhausted)
        else:
            yield s
            s = next(second, exhausted)

I think the following makes it more readable by removing some of the deeper nesting and re-using logic:

def merge(first, second):
    first = iter(first)
    second = iter(second)
    exhausted = object() # just a unique sentinel value

    def _cleanup(item, iterator):
        if item is not exhausted:
           yield item
           yield from iterator

    f = next(first, exhausted)
    if f is exhausted:
        yield from second
    s = next(second, exhausted)
    if s is exhausted:
        yield from _cleanup(f, first)
        return

    while True:
        if f is exhausted:
            yield from _cleanup(s, second)
            return
        elif s is exhausted:
            yield from _cleanup(f, first)
            return
        elif f < s:
            yield f
            f = next(first, exhausted)
        elif f == s:
            yield f
            yield s
            f = next(first, exhausted)
            s = next(second, exhausted)
        else:
            yield s
            s = next(second, exhausted)

The key idea is to keep asking for a value from each of the iterators, yielding the smallest item (or if they are equal, yield both items), and only drawing from the iterator that gave you the smallest item (or from both if they are equal) until one iterator is exhausted then you clean it all up by delegating to the other.

Upvotes: 0

Related Questions