Gonzalo Dambra

Reputation: 980

Understanding the difference between using yield and returning a generator comprehension

I have a scenario that behaves differently depending on whether I implement the solution with a generator comprehension or with the yield keyword.

Here are the two examples:

Example A (this works):

def get_latest_products(self) -> Generator:
    first = True  # avoid appending the CSV header (first row)
    with open('path/to/my/file.csv') as file:
        file = get_csv_reader(file)
        for row in file:
            if not first:
                product = PageProduct(
                    page_name=row[0],
                    category_id=row[1],
                    product_id=row[2],
                    product_url=row[3],
                    product_name=row[4],
                    product_price=row[5],
                )
                yield product
            first = False

Example B (more elegant, and it would work if there were no I/O handling):

def get_latest_products(self) -> Generator:
    with open('path/to/my/file.csv') as file:
        file = get_csv_reader(file)
        return (
            PageProduct(
                page_name=row[0],
                category_id=row[1],
                product_id=row[2],
                product_url=row[3],
                product_name=row[4],
                product_price=row[5],
            ) for index, row in enumerate(file) if index > 0
        )

When I implement Example B, which I think is more readable and elegant, I get the following error when I call next():

  File "/Users/xxx/xxx/collect_products.py", line 157, in <genexpr>
    return (
ValueError: I/O operation on closed file.

Example A, by contrast, works fine. Why?

Upvotes: 1

Views: 49

Answers (1)

chepner

Reputation: 531165

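A generator expression is evaluated lazily. In Example B, the return statement leaves the with block, which closes the file, before the generator has read a single row; the first call to next() then tries to read from a closed file. In Example A, the function body is suspended at each yield, so the with block stays open while you iterate. You would need to define and use the generator inside the context of the with statement. The best way to do this is to have your method take an iterable (maybe a file handle, maybe something else) as an argument, instead of opening the file itself.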

from itertools import islice


def get_latest_products(self, fh) -> Generator:
    f = get_csv_reader(fh)
    # Iterate over the CSV reader (not the raw file handle), skipping the header row.
    yield from (PageProduct(...) for row in islice(f, 1, None))

Then call the method from inside a with statement:

with open('path/to/my/file.csv', 'r') as f:
    for product in foo.get_latest_products(f):
        ...
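
To see the timing issue in isolation, here is a minimal sketch that reproduces both behaviours. It assumes io.StringIO as a stand-in for the real file and csv.reader in place of get_csv_reader; the function names and sample data are made up for illustration.

```python
import csv
import io

# Made-up sample data: one header row followed by two data rows.
CSV_TEXT = "page,product\nshop,apple\nshop,pear\n"


def rows_with_return():
    # Like Example B: `return` runs right away, so the `with` block exits
    # (closing the file) before the generator expression reads any row.
    with io.StringIO(CSV_TEXT) as file:
        reader = csv.reader(file)
        return (row for index, row in enumerate(reader) if index > 0)


def rows_with_yield():
    # Like Example A: the function body is suspended at each `yield`, so the
    # `with` block stays open until iteration finishes.
    with io.StringIO(CSV_TEXT) as file:
        reader = csv.reader(file)
        for index, row in enumerate(reader):
            if index > 0:
                yield row


def try_return_version():
    # Consuming the returned generator reads from the already-closed file.
    try:
        return list(rows_with_return())
    except ValueError as exc:
        return str(exc)
```

Calling list(rows_with_yield()) yields the two data rows, while try_return_version() returns the text of the ValueError raised on the first read.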

Taking the iterable as an argument also makes testing much easier, as you can call get_latest_products with any iterable, rather than relying on a file in the file system.
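
As a rough illustration of that testing benefit, the sketch below feeds a plain list of strings to the function. It makes several assumptions: PageProduct is replaced by a hypothetical namedtuple with the same field names, csv.reader is used in place of get_csv_reader, and the function is written as a plain function rather than a method.

```python
import csv
from collections import namedtuple
from itertools import islice
from typing import Iterable, Iterator

# Hypothetical stand-in for the question's PageProduct class.
PageProduct = namedtuple(
    "PageProduct",
    "page_name category_id product_id product_url product_name product_price",
)


def get_latest_products(lines: Iterable[str]) -> Iterator[PageProduct]:
    # csv.reader is assumed here in place of get_csv_reader.
    reader = csv.reader(lines)
    # islice(reader, 1, None) skips the header row.
    yield from (PageProduct(*row[:6]) for row in islice(reader, 1, None))


# Any iterable of lines works; no file on disk is required.
sample = [
    "page,category,id,url,name,price",
    "shop,7,42,http://example.com/p/42,Widget,9.99",
]
products = list(get_latest_products(sample))
```

Because the function only needs something it can iterate over, the same code runs unchanged against an open file handle in production.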

Upvotes: 2
