cefn

Reputation: 3341

yield-based equivalent to Python3 'yield from' delegation without losing send

I don't know how to use yield (not yield from) to wrap a subgenerator without preventing send() from working. Using yield from allows send() to keep working with a subgenerator, but it hands over control and you can't inspect or count values coming through.

Motivation: I authored some stream iteration using python3 generators, allowing files or sockets or whatever to be read a byte at a time through a common 'interface' for parsers to consume the bytes one by one.

To help with parsing I then extended the generator logic so that the requester can indicate whether the stream should increment its position after yielding the byte (the byte is read AND consumed) or keep its position after yielding (a peek - the byte is ONLY read). This is so rules can be matched before streams are handed on to sub-parsers.

In the current implementation, either of the following will get a byte from the generator AND increment the generator's position in the stream.

byte = stream.send(True)

byte = next(stream)

...while this special invocation gets a byte from a generator WITHOUT incrementing the position in the stream.

byte = stream.send(False)
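As a minimal sketch of this peek/consume protocol (byteStream is a hypothetical stand-in, not the actual medea implementation - the value sent in decides whether the position advances):

```python
def byteStream(data):
    # Yield bytes one at a time; the value sent into each yield decides
    # whether the position advances (consume) or stays put (peek).
    pos = 0
    while pos < len(data):
        increment = yield data[pos]
        if increment is not False:  # next()/send(True)/send(None) all consume
            pos += 1

stream = byteStream(b"ab")
first = next(stream)           # prime: byte at position 0
peek = stream.send(False)      # same byte again, position unchanged
consumed = stream.send(True)   # position advanced, next byte
```

Note the one-call lag inherent to send(): the flag sent in applies to the byte that was just delivered, and the call returns the byte at the resulting position.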

So far so good. My low-memory JSON parser ( https://github.com/ShrimpingIt/medea ) is working well. The example at https://github.com/ShrimpingIt/medea/blob/dd0007e657cd487913c72993dcdaf0f60d8ee30e/examples/scripts/twitterValuesNamed.py is able to process cached tweets from a file.

In the HTTPS case - getting live tweets from a socket - the SSL socket doesn't automatically close at the end of the response, so any parsing procedure just hangs waiting for more data. I would like to fix this.

For this reason I first create the HTTPS stream, then read bytes from the stream, process the content-length header and skip to the "\r\n\r\n" at the end of the HTTP headers before handing the stream at the right position over to the parser. At this point I know how many content bytes the stream should now serve before stopping.
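A minimal sketch of that header-reading step (readHeaders is a hypothetical name; it only uses next(), so it works on any byte iterator, and it assumes well-formed headers terminated by "\r\n" lines):

```python
def readHeaders(stream):
    # Consume bytes up to and including the blank "\r\n" line that ends
    # the HTTP headers, returning the Content-Length value if present.
    # On return, the stream is positioned at the first body byte.
    contentLength = None
    line = bytearray()
    while True:
        line.append(next(stream))
        if line.endswith(b"\r\n"):
            if line == b"\r\n":          # blank line: end of headers
                return contentLength
            name, _, value = bytes(line[:-2]).partition(b":")
            if name.lower() == b"content-length":
                contentLength = int(value.decode())
            line.clear()

response = b"HTTP/1.1 200 OK\r\nContent-Length: 120\r\n\r\nbody"
it = iter(response)
length = readHeaders(it)   # 120; next(it) now yields the first body byte
```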

Unfortunately I then hit a syntax, expressivity or comprehension problem.

Handing over to the stream directly is easy, and the send functionality (allowing to peek bytes) is preserved...

def delegatingStream():
    yield from rawStream

However, I can't figure out how to create an iterator which uses the contentLength value to terminate after contentLength bytes without breaking send().

I need to use yield to be able to intervene in the iterator logic, but only yield from seems to permit send() to be delegated. I can't even get send and yield together to replicate the behaviour of delegatingStream(). For example, this doesn't have the same effect...

def relayingStream():
    while True:
        yield rawStream.send((yield))

The reason for avoiding yield from is that I ultimately need an implementation like this (which doesn't faithfully forward send() either)...

def terminatingStream():
    contentPos = 0
    while contentPos < contentLength:
        increment = (yield)
        yield rawStream.send(increment)
        if increment is not False:
            contentPos += 1
    rawStream.throw(StopIteration)

Any idea how I can correctly forward the values from send(), without using yield from ?

Upvotes: 2

Views: 303

Answers (1)

Martijn Pieters

Reputation: 1123830

You can consult PEP 380 -- Syntax for Delegating to a Subgenerator to see the Python equivalent of yield from, in the Formal Semantics section.

RESULT = yield from EXPR is essentially equivalent to:

_i = iter(EXPR)
try:
    _y = next(_i)
except StopIteration as _e:
    _r = _e.value
else:
    while 1:
        try:
            _s = yield _y
        except GeneratorExit as _e:
            try:
                _m = _i.close
            except AttributeError:
                pass
            else:
                _m()
            raise _e
        except BaseException as _e:
            _x = sys.exc_info()
            try:
                _m = _i.throw
            except AttributeError:
                raise _e
            else:
                try:
                    _y = _m(*_x)
                except StopIteration as _e:
                    _r = _e.value
                    break
        else:
            try:
                if _s is None:
                    _y = next(_i)
                else:
                    _y = _i.send(_s)
            except StopIteration as _e:
                _r = _e.value
                break
RESULT = _r

You can implement exactly that to replace yield from. You can probably drop the RESULT handling since you don't expect a return value. Next is the careful handling of generator.close() and generator.throw(); if you assume these exist then you can simplify further, and use some more readable names:

import sys

it = iter(EXPR)
try:
    value = next(it)
except StopIteration:
    pass
else:
    while True:
        try:
            sent = yield value
        except GeneratorExit:
            it.close()
            raise
        except BaseException:
            try:
                value = it.throw(*sys.exc_info())
            except StopIteration:
                break
        else:
            try:
                value = it.send(sent)
            except StopIteration:
                break

I also used the fact that next(generator) is the moral equivalent of generator.send(None) to remove another test.

Replace EXPR with rawStream and wrap the whole into a function and you can monitor the flow of data in both directions.
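For instance, a sketch combining the simplified expansion with the contentLength termination from the question (byteStream here is a hypothetical stand-in for the raw stream, and the counting policy - count every send that is not False as a consumed byte - is an assumption matching the peek protocol in the question):

```python
import sys

def byteStream(data):
    # Minimal stand-in for the question's raw stream: the value sent in
    # decides whether the position advances (peek vs consume).
    pos = 0
    while pos < len(data):
        increment = yield data[pos]
        if increment is not False:
            pos += 1

def terminatingStream(rawStream, contentLength):
    # Hand-rolled delegation modelled on the PEP 380 expansion, forwarding
    # send()/throw()/close() to rawStream, but returning (raising
    # StopIteration) once contentLength bytes have been consumed.
    contentPos = 0
    it = iter(rawStream)
    try:
        value = next(it)                  # fetch the first byte
    except StopIteration:
        return
    while True:
        try:
            sent = yield value            # deliver a byte, receive the flag
        except GeneratorExit:
            it.close()
            raise
        except BaseException:
            try:
                value = it.throw(*sys.exc_info())
            except StopIteration:
                return
        else:
            if sent is not False:         # the delivered byte was consumed
                contentPos += 1
                if contentPos >= contentLength:
                    return                # body exhausted: stop delegating
            try:
                value = it.send(sent)
            except StopIteration:
                return

stream = terminatingStream(byteStream(b"abcd"), 2)
next(stream)          # first byte
stream.send(False)    # peek: same byte, position unchanged
stream.send(True)     # consume: second byte
# the next consuming send raises StopIteration after 2 consumed bytes
```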

Upvotes: 5
