Reputation: 3336
Is there a concise and memory-efficient way to find out whether two iterators lines1 and lines2 yield the same items?
For example, these iterators could be lines retrieved from a file object:
with io.open('some.txt', 'r', encoding='...') as lines1:
    with io.open('other.txt', 'r', encoding='...') as lines2:
        lines_are_equal = ...
Intuitively one could expect that
lines_are_equal = lines1 == lines2 # DOES NOT WORK
would give the desired result. However, for two distinct iterators this will always be False because it only compares the identity of the iterator objects instead of the items yielded.
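A minimal sketch of that behavior, using two in-memory list iterators as stand-ins for the file objects (the sample data is made up for illustration):

```python
# Two separate iterators that would yield exactly the same items.
lines1 = iter(["a\n", "b\n"])
lines2 = iter(["a\n", "b\n"])

# Iterators do not define their own __eq__, so == falls back to identity.
distinct_result = lines1 == lines2
same_object_result = lines1 == lines1

print(distinct_result)     # False, although the items are equal
print(same_object_result)  # True, because it is the same object
```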
If memory were not an issue, one could convert the iterators to lists and compare them:
lines_are_equal = list(lines1) == list(lines2) # works but uses a lot of memory
I already checked itertools, expecting to find something like
lines_are_equal = itertools.equal(lines1, lines2) # DOES NOT WORK
but there does not seem to be any function like that.
The best I could come up with so far is looping over both iterators using itertools.zip_longest() (Python 2: izip_longest()):
lines_are_equal = True
for line1, line2 in itertools.zip_longest(lines1, lines2):
    if line1 != line2:
        lines_are_equal = False
        break
This does give the desired result and is memory efficient; however, it feels clumsy and unpythonic.
Is there a better way to do this?
Solution: Applying the collective wisdom from the comments and the answer, this is the one-line helper function that works even if the two iterators are the same object or can yield trailing None values:
def iter_equal(items1, items2):
    '''`True` if iterators `items1` and `items2` contain equal items.'''
    return (items1 is items2) or \
        all(a == b for a, b in itertools.zip_longest(items1, items2, fillvalue=object()))
You still have to make sure the iterators do not have side effects on each other.
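To illustrate the side-effect caveat, here is a small sketch in which two iterators draw from the same underlying source. The names source, view1 and view2 are hypothetical, and using itertools.tee() to obtain independent iterators is one possible workaround that is not part of the original discussion:

```python
import itertools

def iter_equal(items1, items2):
    '''`True` if iterators `items1` and `items2` contain equal items.'''
    return (items1 is items2) or all(
        a == b
        for a, b in itertools.zip_longest(items1, items2, fillvalue=object()))

# Both views consume the same source, so each step of one steals an item
# from the other: the first pair compared is (1, 2).
source = iter([1, 2, 3, 4])
view1 = (x for x in source)
view2 = (x for x in source)
interfering = iter_equal(view1, view2)
print(interfering)  # False

# tee() hands out independent iterators over a single source. Because
# iter_equal advances both in lockstep, tee() only buffers about one item.
copy1, copy2 = itertools.tee(iter([1, 2, 3, 4]))
independent = iter_equal(copy1, copy2)
print(independent)  # True
```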
Upvotes: 3
Views: 1862
Reputation: 369244
How about using all with a generator expression:
lines_are_equal = all(a == b for a, b in itertools.zip_longest(lines1, lines2))
UPDATE: If an iterable can produce trailing None values, it's better to specify fillvalue=object(), as user2357112 commented (by default, None is used as the fill value):
lines_are_equal = all(a == b for a, b in
                      itertools.zip_longest(lines1, lines2, fillvalue=object()))
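The difference between the two variants shows up when one input is a prefix of the other and the extra items are None. A minimal sketch, where the helper names equal_default_fill and equal_sentinel_fill and the sample lists are invented for illustration:

```python
import itertools

def equal_default_fill(items1, items2):
    # The shorter input is padded with None, the default fill value.
    return all(a == b for a, b in itertools.zip_longest(items1, items2))

def equal_sentinel_fill(items1, items2):
    # The shorter input is padded with a fresh object that equals nothing else.
    return all(a == b for a, b in
               itertools.zip_longest(items1, items2, fillvalue=object()))

short = [1, 2]
padded = [1, 2, None]

default_result = equal_default_fill(iter(short), iter(padded))
sentinel_result = equal_sentinel_fill(iter(short), iter(padded))

print(default_result)   # True: the padding None matches the real None
print(sentinel_result)  # False: the length difference is detected
```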
If your purpose is to compare two files, not arbitrary iterables, you can use filecmp.cmp instead:
files_are_equal = filecmp.cmp(filename1, filename2)
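A small sketch of how this could look in practice, assuming temporary files created only for the demonstration (the file names and contents are made up). Note that filecmp.cmp compares raw bytes rather than decoded lines, and that with the default shallow=True it treats files with identical os.stat() signatures as equal without reading them; passing shallow=False forces a content comparison:

```python
import filecmp
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path_a = os.path.join(tmp, "some.txt")
    path_b = os.path.join(tmp, "other.txt")
    path_c = os.path.join(tmp, "third.txt")

    # Write bytes directly so the on-disk content is unambiguous.
    for path, content in [(path_a, b"one\ntwo\n"),
                          (path_b, b"one\ntwo\n"),
                          (path_c, b"one\nTWO\n")]:
        with open(path, "wb") as f:
            f.write(content)

    # shallow=False compares the file contents themselves.
    same = filecmp.cmp(path_a, path_b, shallow=False)
    different = filecmp.cmp(path_a, path_c, shallow=False)

print(same)       # True
print(different)  # False
```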
Upvotes: 5