sapi

Reputation: 10224

Is it good practice to yield from within a context manager?

I recently wrote a method which returned a sequence of open files; in other words, something like this:

# this is very much simplified, of course
# the actual code returns file-like objects, not necessarily files
import os
from itertools import imap  # Python 2; on Python 3 the built-in map is lazy

def _iterdir(self, *path):
    dr = os.path.join(*path)
    paths = imap(lambda fn: os.path.join(dr, fn), os.listdir(dr))

    return imap(open, paths)

Semantically, I don't expect to have to close the resulting objects when I use them like this:

for f in _iterdir('/', 'usr'):
    make_unicorns_from(f)
    # no explicit f.close() needed

As a result, I decided to wrap _iterdir in a context manager:

def iterdir(self, *path):
    it = self._iterdir(*path)

    while True:
        # Python 2: it.next(); the StopIteration from an exhausted
        # iterator ends the generator.
        with it.next() as f:
            yield f

This appears to be working correctly.

What I'm interested in is whether doing this is good practice. Will I run into any issues following this pattern (perhaps if exceptions are thrown)?

Upvotes: 15

Views: 8047

Answers (2)

Veedrac

Reputation: 60137

The whole point of with is to unify opening and closing with exception safety and explicit lifetimes. Your abstraction removes some of that safety, but not all of it.

Here's a totally-simplified example:

def with_open():
    with open(...) as f:
        yield f

Consider an exception in its usage:

for _ in with_open():
    raise NotImplementedError

The exception propagates out of the loop without exhausting the generator, so the with block inside it never exits and the file is left open. Possibly forever.

Consider incomplete, non-exception-based exits, too:

for _ in with_open():
    break

for _ in with_open():
    return

next(with_open())
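
In each of these cases the generator is abandoned while suspended at its yield, so the with block's exit only runs once something finalizes the generator. A minimal sketch (using a concrete, hypothetical path in place of the ... placeholder) makes this visible:

open('example.txt', 'w').close()  # ensure the demo file exists

def with_open(path='example.txt'):  # hypothetical concrete stand-in
    with open(path) as f:
        yield f

gen = with_open()
f = next(gen)    # generator suspended inside its with block
print(f.closed)  # False: the with block's exit hasn't run
gen.close()      # raises GeneratorExit at the yield...
print(f.closed)  # True: ...which finally closes the file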

One option is to return a context manager itself, so that you can do:

from functools import partial

def with_open():
    yield partial(open, ...)

for filecontext in with_open():
    with filecontext() as f:
        break

Another, more direct solution, would be to define the function as

from contextlib import closing

def iterdir(self, *path):
    def inner():
        for file in self._iterdir(*path):
            with file:
                yield file

    return closing(inner())

and use it as

with iterdir('/', 'usr') as files:
    for file in files:
        ...

This guarantees closure without having to move the opening of files to the caller.
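
To see the guarantee in action, here is a hedged sketch (assuming iterdir is reachable as a plain function, without self): even if the loop body raises, closing() calls the generator's close(), which throws GeneratorExit at the yield and lets the inner with close the current file.

try:
    with iterdir('/', 'usr') as files:
        for f in files:
            raise NotImplementedError
except NotImplementedError:
    pass
# The file that was open when the exception fired is closed:
# closing() finalized the generator on the way out of the with.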

Upvotes: 9

user2357112

Reputation: 280456

There are two problems I see. One is that if you try to use more than one file at a time, things break:

list(iterdir('/', 'usr')) # Doesn't work; they're all closed.
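
A hedged illustration of why: each file is open only during its own loop iteration, so collecting the files defeats the wrapper:

for f in iterdir('/', 'usr'):
    data = f.read()  # fine: f is still open inside its iteration

files = list(iterdir('/', 'usr'))
files[0].read()      # ValueError: I/O operation on closed file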

The second problem is unlikely to bite in CPython, but it can show up if you have a reference cycle, or if your code ever runs on a different Python implementation.

If an exception happens in make_unicorns_from(f):

for f in iterdir('/', 'usr'):
    make_unicorns_from(f) # Uh oh, not enough biomass.

The file you were using won't be closed until the generator is garbage-collected. At that point, the generator's close method will be called, throwing a GeneratorExit exception at the point of the last yield, and the exception will cause the context manager to close the file.

With CPython's reference counting, this usually happens immediately. However, on a non-reference-counted implementation or in the presence of a reference cycle, the generator might not be collected until a cycle-detecting GC pass is run. This could take a while.
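
If you want the cleanup to be deterministic anyway, one option (my sketch, not something the pattern requires) is to close the generator explicitly when you are done with it:

gen = iterdir('/', 'usr')
try:
    for f in gen:
        make_unicorns_from(f)
finally:
    gen.close()  # throws GeneratorExit at the yield, closing the file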


My gut says to leave closing the files to the caller. You can do

for f in _iterdir('/', 'usr'):
    with f:
        make_unicorns_from(f)

and they'll all be closed promptly, even without a with in the generator, and even if an exception is thrown. I don't know whether or not this is actually a better idea than having the generator take charge of closing the files.
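
Because _iterdir maps open over the paths lazily, files are opened only as the loop pulls them, so an early exception never leaves later files dangling. A hedged sketch of the failure case:

try:
    for f in _iterdir('/', 'usr'):
        with f:
            make_unicorns_from(f)  # suppose this raises mid-loop
except Exception:
    pass
# The file being processed was closed by its with block before the
# exception propagated; files never reached were never opened at all.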

Upvotes: 8
