Freeing a generator's resources after iterating over only part of the sequence

I would like to parse a string line by line and provide a generator for the results of each parse. The code that iterates over these results may choose not to iterate over the full sequence if it finds the information it wants:

import StringIO

def foo(string):
  sstream = StringIO.StringIO(string)
  for line in sstream:
    res = doSomethingWith(line)
    yield res
  sstream.close()

for bar in foo(mystring):
  if condition(bar):
    break

I presume that this will leave sstream open if condition(bar) becomes True. What is the best way to guarantee that sstream will be closed when we're finished iterating over foo()? Will I have to wrap the generator in a class definition and implement __del__? Or can I rely on garbage collection here? I plan to call foo() for a lot of different strings.

Upvotes: 2

Views: 146

Answers (2)

Aya

Reputation: 41950

What is the best way to guarantee that sstream will be closed when we're finished iterating over foo()?

In the general case of a 'cleanup' function that absolutely has to be called, you'll probably have to call it outside of the generator with something like...

from StringIO import StringIO

def foo(sstream):
    for line in sstream:
        res = doSomethingWith(line)
        yield res

sio = StringIO(mystring)
try:
    for bar in foo(sio):
        if condition(bar):
            break
finally:
    sio.close()
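
The same try/finally pattern can also be written with contextlib.closing. The following is only a sketch under some assumptions: it uses Python 3's io.StringIO rather than the Python 2 StringIO module above, and do_something_with() and condition() are hypothetical stand-ins for the question's doSomethingWith() and condition().

```python
from contextlib import closing
from io import StringIO  # Python 3; the code above uses Python 2's StringIO module


def do_something_with(line):
    # Hypothetical stand-in for the question's doSomethingWith()
    return line.strip().upper()


def condition(result):
    # Hypothetical stand-in for the question's condition()
    return result == "B"


def foo(stream):
    for line in stream:
        yield do_something_with(line)


seen = []
# closing() calls sio.close() when the with block is left, however it is left
with closing(StringIO("a\nb\nc\n")) as sio:
    for bar in foo(sio):
        seen.append(bar)
        if condition(bar):
            break

print(seen, sio.closed)
```

The stream is owned by the caller here, so the generator can be abandoned halfway without affecting when the stream is closed.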

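A context manager used inside a generator runs its cleanup on an early exit only if that cleanup is protected by try/finally. When an abandoned generator is closed, GeneratorExit is raised at the suspended yield, so any code after an unprotected yield is skipped. For example, this version, which has no try/finally in my_stringio...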

from StringIO import StringIO
from contextlib import contextmanager

@contextmanager
def my_stringio(s):
    print 'creating StringIO'
    sio = StringIO(s)
    yield sio
    print 'calling close()'
    sio.close()

def mygen():
    with my_stringio('abcdefghij') as sio:
        while 1:
            char = sio.read(1)
            if not char:
                break
            yield char

for char in mygen():
    print char
    if char == 'c':
        break

...never prints 'calling close()'.
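
For comparison, here is a sketch of the same example with the cleanup moved into a finally block. It is written for Python 3 (io.StringIO), and the events and chars lists are hypothetical additions that record what happened in place of the print statements. The explicit gen.close() is there so the result does not depend on when the interpreter collects the generator.

```python
from contextlib import contextmanager
from io import StringIO  # Python 3 equivalent of the StringIO used above

events = []  # hypothetical log, used instead of print


@contextmanager
def my_stringio(s):
    events.append("creating StringIO")
    sio = StringIO(s)
    try:
        yield sio
    finally:
        # Runs on normal exit, on an exception, and on GeneratorExit
        events.append("calling close()")
        sio.close()


def mygen():
    with my_stringio("abcdefghij") as sio:
        while True:
            char = sio.read(1)
            if not char:
                break
            yield char


gen = mygen()
chars = []
for char in gen:
    chars.append(char)
    if char == "c":
        break
gen.close()  # raises GeneratorExit at the suspended yield, unwinding the with block

print(chars, events)
```

With the finally block in place, 'calling close()' is recorded even though the loop stops at 'c'.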

Will I have to wrap the generator in a class definition and implement __del__?

That's another option, but the problem with that approach is that if you somehow manage to create a circular reference with a class instance, the __del__ method will never get called (this applies to Python 2 and to Python 3 before 3.4).

Or can I rely on garbage collection here?

In this case, you can.

With a StringIO it doesn't really matter whether you call the close() method. It holds only memory, and that memory is reclaimed regardless of the way your for loop terminates: once the loop ends, nothing references the generator any more, so the generator and its locals (including sstream) are garbage-collected. In CPython this typically happens right away, because of reference counting.
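
If the resource were something where the timing of cleanup matters (a real file, say), one possible sketch is to put try/finally inside the generator and have the caller close the generator explicitly. This assumes Python 3 (io.StringIO); the strip() call and the "y" comparison are hypothetical stand-ins for doSomethingWith() and condition(), and closed_flags is only there to make the effect observable.

```python
from contextlib import closing
from io import StringIO  # Python 3; the question uses Python 2's StringIO module

closed_flags = []  # hypothetical: records that the finally block ran


def foo(string):
    sstream = StringIO(string)
    try:
        for line in sstream:
            yield line.strip()  # stand-in for doSomethingWith(line)
    finally:
        # Reached on exhaustion and also when the generator is closed early
        sstream.close()
        closed_flags.append(sstream.closed)


results = []
# closing() calls gen.close() on exit, which raises GeneratorExit inside foo()
with closing(foo("x\ny\nz\n")) as gen:
    for bar in gen:
        results.append(bar)
        if bar == "y":  # stand-in for condition(bar)
            break

print(results, closed_flags)
```

This keeps the stream private to foo() while still giving the caller a deterministic point at which it is closed.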

Upvotes: 2

Brian

Reputation: 555

EDIT: Never mind the broken nonsense below; as far as I know, you would need to perform the break in the for loop where the yield is located.

Might something like this work? I could easily be overlooking something.

import StringIO

# perform the break on the inner for loop first, to ensure sstream gets closed
# (as the EDIT says, this does not work: foo() only sees the value break_ had when it was called)
break_ = False
def foo(string, break_):
  sstream = StringIO.StringIO(string)
  for line in sstream:
    res = doSomethingWith(line)
    if not break_: yield res
    else: break
  sstream.close()

for bar in foo(mystring, break_):
  if break_:
    break
  elif condition(bar):
    break_ = True

Upvotes: 1
