Jose

Reputation: 340

for loop not executing asynchronously in parallel code

I have the following code:

from concurrent.futures import ProcessPoolExecutor, as_completed

with ProcessPoolExecutor(max_workers=None) as executor:
    futures = [executor.submit(execute, row) for row in fetch_row()]
    for future in as_completed(futures):
        pass  # do something

As I understand it, futures should be populated asynchronously while the for loop handles each future as it becomes available.

I've written a print statement right before the return in execute to make sure that the function is indeed about to return.

The jobs do appear to be submitted in parallel, and they do return. However, the for loop does not run as each job returns from its process; it only starts after the list has been fully built.

I'd like the for loop to run while the futures are still coming in.

fetch_row() is a generator that just reads from a CSV file.

Upvotes: 2

Views: 82

Answers (1)

MarianD

Reputation: 14121

Instead of creating a list

futures = [executor.submit(execute, row) for row in fetch_row()]

create a generator (parentheses instead of brackets):

futures = (executor.submit(execute, row) for row in fetch_row())
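
One caveat worth checking: as far as I can tell from the CPython implementation, as_completed() copies its argument into a set before it yields anything, so a generator passed to it is still drained completely first. If that matters, one possible workaround is to keep only a bounded number of tasks in flight and use wait() with FIRST_COMPLETED. The sketch below is only an illustration: iter_results, max_in_flight and square are hypothetical names, square stands in for execute, and a ThreadPoolExecutor is used so the snippet runs anywhere. A ProcessPoolExecutor exposes the same submit() interface and should be usable the same way, provided the function and rows are picklable.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from itertools import islice


def iter_results(executor, func, rows, max_in_flight=8):
    """Yield func(row) results as tasks finish, reading rows lazily.

    Hypothetical helper: at most max_in_flight tasks are pending at once.
    """
    rows = iter(rows)
    # Prime the pool with a bounded number of tasks.
    pending = {executor.submit(func, row) for row in islice(rows, max_in_flight)}
    while pending:
        # Block until at least one pending task has finished.
        done, pending = wait(pending, return_when=FIRST_COMPLETED)
        # Top the pool back up: read one new row per finished task.
        for row in islice(rows, len(done)):
            pending.add(executor.submit(func, row))
        # Hand finished results to the caller (re-raises task exceptions).
        for future in done:
            yield future.result()


def square(x):
    # Stand-in for the question's execute(); an assumption for this demo.
    return x * x


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=4) as executor:
        # Results arrive in completion order, not submission order.
        for result in iter_results(executor, square, range(10)):
            print(result)
```

With this shape the for loop body runs while later rows have not even been read from the CSV yet, and memory use stays bounded by max_in_flight rather than by the size of the file.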

Upvotes: 1
