Reputation: 591
As zip yields as many values as the shortest iterable given, I would have expected passing zero arguments to zip to return an iterable yielding infinitely many tuples, instead of returning an empty iterable. After all, inf ∅ is infinite, not zero.
This would have also been consistent with how other monoidal operations behave:
>>> sum([]) # sum
0
>>> math.prod([]) # product
1
>>> all([]) # logical conjunction
True
>>> any([]) # logical disjunction
False
>>> list(itertools.product()) # Cartesian product
[()]
For each of these operations, the value returned when given no arguments is the identity value for the operation, which is to say, one that does not modify the result when included in the operation:
sum(xs) == sum([*xs, 0]) == sum([*xs, sum([])])
math.prod(xs) == math.prod([*xs, 1]) == math.prod([*xs, math.prod([])])
all(xs) == all([*xs, True]) == all([*xs, all([])])
any(xs) == any([*xs, False]) == any([*xs, any([])])
Or at least, one that gives a trivially isomorphic result:
itertools.product(*xs, itertools.product())
≡ itertools.product(*xs, [()])
≡ (*x, ()) for x in itertools.product(*xs)
In the case of zip, this would have been:
zip(*xs, zip())
≡ f(x) for x in zip(*xs)
for some function f. Because zip returns an n-tuple when given n arguments, it follows that zip() with 0 arguments must yield 0-tuples, i.e. (). This forces f to return (*x, ()) and therefore zip() to be equivalent to itertools.repeat(()). Another, more general law is:
((*x, *y) for x, y in zip(zip(*xs), zip(*ys)))
≡ zip(*xs, *ys)
which would have then held for all xs and ys, including when either xs or ys is empty (and does hold for itertools.product).
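To make the law concrete, here is a small sketch that checks it on list inputs for both functions. The helper name law_holds and the sample inputs are made up for illustration and assume re-iterable arguments such as lists or strings.

```python
import itertools

def law_holds(combine, xs, ys):
    """Check that flattening combine(combine(*xs), combine(*ys))
    gives the same tuples as combine(*xs, *ys)."""
    nested = [(*x, *y) for x, y in combine(combine(*xs), combine(*ys))]
    flat = list(combine(*xs, *ys))
    return nested == flat

xs = [[1, 2], "ab"]
ys = [[3, 4]]

# Both functions satisfy the law when xs and ys are non-empty.
print(law_holds(itertools.product, xs, ys))  # True
print(law_holds(zip, xs, ys))                # True

# With an empty xs, only itertools.product still satisfies it,
# because zip() yields nothing and so the outer zip yields nothing.
print(law_holds(itertools.product, [], ys))  # True
print(law_holds(zip, [], ys))                # False
```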
Yielding empty tuples indefinitely is also the behaviour that falls out of this straightforward reimplementation:
def my_zip(*iterables):
    iterables = tuple(map(iter, iterables))
    while True:
        item = []
        for it in iterables:
            try:
                item.append(next(it))
            except StopIteration:
                return
        yield tuple(item)
which means that the case of zip with no arguments must have been specifically special-cased not to do that.
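The difference can be observed directly. The sketch below repeats the reimplementation so that it runs standalone, and uses itertools.islice to cap the otherwise endless stream; the variable names are only illustrative.

```python
import itertools

def my_zip(*iterables):
    # Same reimplementation as above, repeated so the snippet runs standalone.
    iterables = tuple(map(iter, iterables))
    while True:
        item = []
        for it in iterables:
            try:
                item.append(next(it))
            except StopIteration:
                return
        yield tuple(item)

# With no arguments the inner loop never runs, so nothing can stop the outer loop.
first_three = list(itertools.islice(my_zip(), 3))
print(first_three)  # [(), (), ()]

# The built-in yields nothing at all in the same situation.
builtin_result = list(zip())
print(builtin_result)  # []
```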
Why is zip() not equivalent to itertools.repeat(()) despite all the above?
Upvotes: 0
Views: 287
Reputation: 9858
PEP 201 and related discussion show that zip() with no arguments originally raised an exception. It was changed to return an empty list because this is more convenient for some cases of zip(*s) where s turns out to be an empty list. No consideration was given to what might be the 'identity', which in any case appears difficult to define with respect to zip: there is nothing you can zip with arbitrary x that will return x.
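As a rough illustration of that convenience (an example of the general pattern, not something taken from the PEP), consider transposing a list of rows with zip(*rows). The transpose helper below is hypothetical:

```python
def transpose(rows):
    """Turn a list of rows into a list of columns using zip(*rows)."""
    return [list(column) for column in zip(*rows)]

print(transpose([[1, 2, 3], [4, 5, 6]]))  # [[1, 4], [2, 5], [3, 6]]

# When rows is empty this becomes zip(), which yields nothing,
# so the result is simply an empty list.
print(transpose([]))  # []
```

If zip() instead behaved like itertools.repeat(()), the second call would never finish, so every caller would need to check for an empty input first.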
It is not clear why certain commutative and associative mathematical functions were originally made to return the identity by default when applied to an empty list, but the choice may have been driven by convenience, the principle of least astonishment, and the history of earlier languages like Perl or ABC. Explicit reference to the concept of mathematical identity is rarely if ever made (see e.g. Reason for "all" and "any" result on empty lists). So there is no reason to rely on functions in general to do this. In many cases it would be less surprising for them to raise an exception instead.
Upvotes: 2