Reputation: 20856
I need to compare two tables with similar schemas, and I have two generator objects. How do I compare these two generators row by row in Python? I need to implement file-comparison logic along these lines:
    if generator-object-1 == generator-object-2:
        read-next-row-generator-object-1, read-next-row-generator-object-2
    elif generator-object-1 > generator-object-2:
        read-next-row-generator-object-2
    elif generator-object-1 < generator-object-2:
        read-next-row-generator-object-1
Is there a better way to do this in Python?
Upvotes: 2
Views: 98
Reputation: 1122222
I used this in the past:
    import operator

    def mergeiter(*iterables, **kwargs):
        """Given a set of sorted iterables, yield the next value in merged order"""
        # Map each input's index to [current value, index, iterator].
        heads = {}
        for i, it in enumerate(map(iter, iterables)):
            try:
                heads[i] = [next(it), i, it]
            except StopIteration:
                pass  # skip inputs that are empty from the start
        if 'key' not in kwargs:
            key = operator.itemgetter(0)
        else:
            key = lambda item, key=kwargs['key']: key(item[0])

        while heads:
            # Pick the input whose current value is smallest.
            value, i, it = min(heads.values(), key=key)
            yield value
            try:
                heads[i][0] = next(it)
            except StopIteration:
                # This input is exhausted; the loop ends once none remain.
                # (A bare `raise` here turns into RuntimeError under PEP 479.)
                del heads[i]
This would list items from the given iterables in sorted order, provided the input iterables are themselves already sorted.
The above generator would iterate over your two generators in the same order as your pseudocode would.
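For what it's worth, the standard library's heapq.merge does the same kind of lazy merge over already-sorted inputs (the key argument needs Python 3.5 or later). A minimal sketch, assuming the rows are tuples sorted by their first field; rows_a, rows_b and the sample rows are made up for illustration:

```python
import heapq
from operator import itemgetter

def rows_a():
    # Hypothetical sorted rows from the first table.
    yield (1, "alice")
    yield (3, "carol")

def rows_b():
    # Hypothetical sorted rows from the second table.
    yield (2, "bob")
    yield (3, "carol")

# heapq.merge pulls from both generators lazily, in sorted order.
merged = list(heapq.merge(rows_a(), rows_b(), key=itemgetter(0)))
print(merged)
```

Like mergeiter, this only gives a correct result when each input is already sorted by the same key.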
Upvotes: 3
Reputation: 309929
There isn't really a much better way...
    try:
        # Fetch the first rows inside the try so an empty generator is handled too.
        go1 = next(generator1)
        go2 = next(generator2)
        while True:
            if go1 == go2:
                go1 = next(generator1)
                go2 = next(generator2)
            elif go1 > go2:
                go2 = next(generator2)
            elif go1 < go2:
                go1 = next(generator1)
    except StopIteration:
        pass  # Done now: one of the generators is exhausted
Of course, what you're describing here is really the merge stage of a merge sort (or at least that's how it seems), although you don't yield the rest of the objects after one generator is exhausted. CPython's built-in sort is very merge-like (Timsort is a hybrid of insertion sort and merge sort). So, in this case, if you don't mind having a list at the end, you could just do:
    import itertools as it
    sorted(it.chain(generator1, generator2))
and Bob's your uncle.
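If the goal is to report which rows differ rather than to merge them, the same walk can be wrapped in a generator. This is only a sketch: compare_sorted and the tag names are hypothetical, it assumes both inputs are sorted and their rows are mutually comparable, and unlike the loop above it keeps going after one side runs out so that leftover rows are reported too.

```python
def compare_sorted(left, right):
    """Yield (tag, row) pairs describing how two sorted iterables differ."""
    left, right = iter(left), iter(right)
    done = object()  # sentinel marking an exhausted iterator
    a, b = next(left, done), next(right, done)
    while a is not done or b is not done:
        if b is done or (a is not done and a < b):
            # Row exists only on the left side.
            yield ("left_only", a)
            a = next(left, done)
        elif a is done or b < a:
            # Row exists only on the right side.
            yield ("right_only", b)
            b = next(right, done)
        else:
            # Rows compare equal: advance both sides.
            yield ("both", a)
            a, b = next(left, done), next(right, done)

result = list(compare_sorted(iter([1, 2, 4]), iter([2, 3, 4, 5])))
print(result)
```

Using a sentinel with the two-argument form of next() avoids the try/except around StopIteration.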
Upvotes: 0