Reputation: 10224
I recently wrote a method which returned a sequence of open files; in other words, something like this:
# this is very much simplified, of course
# the actual code returns file-like objects, not necessarily files
import os
from itertools import imap  # Python 2's lazy map

def _iterdir(self, *path):
    dr = os.path.join(*path)
    paths = imap(lambda fn: os.path.join(dr, fn), os.listdir(dr))
    return imap(open, paths)
Syntactically, I do not expect to have to close the resulting objects if I do something like:
for f in _iterdir('/', 'usr'):
    make_unicorns_from(f)
    # ! f.close()
As a result, I decided to wrap _iterdir so that each file it yields is managed by a with statement:
def iterdir(self, *path):
    it = self._iterdir(*path)
    while 1:
        with it.next() as f:
            yield f
This appears to be working correctly.
What I'm interested in is whether doing this is good practice. Will I run into any issues following this pattern (perhaps if exceptions are thrown)?
Upvotes: 15
Views: 8047
Reputation: 60137
The whole point of with is to unify opening and closing with exception safety and explicit lifetimes. Your abstraction removes some of that, but not all of it.
Here's a totally-simplified example:
def with_open():
    with open(...) as f:
        yield f
Consider an exception in its usage:
for _ in with_open():
    raise NotImplementedError
The exception ends the loop without exhausting the generator, so the with block never exits and the file is left open. Possibly forever.
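To make that concrete, here is a minimal Python 3 sketch, with io.StringIO standing in for a real file and a hypothetical with_open_stub in place of the generator above: after the exception, the generator is still suspended at its yield, so nothing has closed the object.

import io

def with_open_stub():
    # hypothetical stand-in for with_open(): yield from inside a with block
    with io.StringIO("spam") as f:
        yield f

gen = with_open_stub()
try:
    for f in gen:
        raise NotImplementedError
except NotImplementedError:
    pass

print(f.closed)  # False: the generator is still suspended at its yield,
                 # so the with block never exited and f was never closed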
Consider incomplete, non-exception-based exits, too:
for _ in with_open():
    break

for _ in with_open():
    return

next(with_open())
One option is to yield a factory for the context manager instead, so that you can do:
from functools import partial

def with_open():
    yield partial(open, ...)

for filecontext in with_open():
    with filecontext() as f:
        break
Another, more direct solution would be to define the function as
from contextlib import closing

def iterdir(self, *path):
    def inner():
        for file in self._iterdir(*path):
            with file:
                yield file
    return closing(inner())
and use it as
with iterdir() as files:
    for file in files:
        ...
This guarantees closure without having to move the opening of files to the caller.
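As a sanity check, here is a self-contained Python 3 sketch of the same pattern, with io.StringIO objects and made-up stub names in place of the files from _iterdir: leaving the with block early still closes the object that was in use, because closing() calls close() on the generator and GeneratorExit unwinds the inner with.

import io
from contextlib import closing

def _iterdir_stub():
    # hypothetical stand-in for self._iterdir(): yields open file-like objects
    for text in ("spam", "eggs"):
        yield io.StringIO(text)

def iterdir_stub():
    def inner():
        for f in _iterdir_stub():
            with f:
                yield f
    return closing(inner())

with iterdir_stub() as files:
    first = next(files)
    print(first.closed)  # False: still usable here
    # leave early, without exhausting the generator

print(first.closed)  # True: closing() called files.close(), which raised
                     # GeneratorExit at the yield and ran the inner with's exit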
Upvotes: 9
Reputation: 280456
There are two problems I see. One is that if you try to use more than one file at a time, things break:
list(iterdir('/', 'usr')) # Doesn't work; they're all closed.
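To see why, here is a small Python 3 sketch with a hypothetical stand-in for the question's iterdir: advancing the generator past each yield re-enters the with block, which closes the object that was just handed out, so by the time list() finishes everything is already closed.

import io

def iterdir_stub():
    # hypothetical stand-in for iterdir(): yield each object from inside a with
    for text in ("spam", "eggs"):
        with io.StringIO(text) as f:
            yield f

files = list(iterdir_stub())
print([f.closed for f in files])  # [True, True]: each one was closed as soon
                                  # as the generator moved on to the next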
The second problem is unlikely to bite in CPython, but it can manifest if you have a reference cycle or if your code is ever run on a different Python implementation.
If an exception happens in make_unicorns_from(f):
for f in iterdir('/', 'usr'):
    make_unicorns_from(f)  # Uh oh, not enough biomass.
The file you were using won't be closed until the generator is garbage-collected. At that point, the generator's close method will be called, throwing a GeneratorExit exception at the point of the last yield, and the exception will cause the context manager to close the file.
With CPython's reference counting, this usually happens immediately. However, on a non-reference-counted implementation or in the presence of a reference cycle, the generator might not be collected until a cycle-detecting GC pass is run. This could take a while.
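Here is a minimal Python 3 sketch of those mechanics, doing explicitly what garbage collection eventually does (io.StringIO stands in for a real file): calling close() on the generator raises GeneratorExit at the suspended yield, and the with block then closes the object.

import io

def iterdir_stub():
    # hypothetical stand-in: a single object yielded from inside a with block
    with io.StringIO("spam") as f:
        yield f

gen = iterdir_stub()
f = next(gen)
print(f.closed)  # False: the generator is suspended inside the with block
gen.close()      # what GC eventually does: raises GeneratorExit at the yield
print(f.closed)  # True: the with block's exit closed the object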
My gut says to leave closing the files to the caller. You can do
for f in _iterdir('/', 'usr'):
    with f:
        make_unicorns_from(f)
and they'll all be closed promptly, even without a with in the generator, and even if an exception is thrown. I don't know whether or not this is actually a better idea than having the generator take charge of closing the files.
Upvotes: 8