Reputation: 747
So, I have a 9000-line CSV file. I have read it and stored it in a dictionary of lists keyed by a string m, so list[m] gives one list of items. What I want to do is loop over every item in list[m] and pass it to a function processItem(item). This processItem returns a string in CSV-like format, and my aim is to write the result of processItem for every item in the list to a file. Is there a way to do this in a multithreaded fashion?
I think I should divide the list into N sub-lists and then process those sub-lists in parallel: every thread would return the string produced from its sub-list, the results would then be merged, and finally the merged result would be written to a file. How can I implement that?
Upvotes: 1
Views: 2144
Reputation: 19717
This is a perfect use case for the multiprocessing module and its Pool() (be aware that the threading module will not speed this up: CPython's GIL prevents threads from running Python code in parallel). You have to apply a function to each element of your list, so this is easily parallelized.
from multiprocessing import Pool
with Pool() as p:
    processed = p.map(processItem, lst)
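For completeness, here is a minimal end-to-end sketch that also merges the results and writes them to a file. The dict name data, the key "m", the output file name, and the body of processItem() are placeholders for your own code:

from multiprocessing import Pool

def processItem(item):
    # placeholder: build one CSV-formatted line from an item
    return ",".join(str(field) for field in item)

def main():
    # placeholder input: a dictionary of lists keyed by string, as in the question
    data = {"m": [("a", 1), ("b", 2), ("c", 3)]}
    with Pool() as p:
        lines = p.map(processItem, data["m"])
    # merge the per-item strings and write them out in one go
    with open("output.csv", "w") as f:
        f.write("\n".join(lines) + "\n")

if __name__ == "__main__":
    # the guard is required on platforms that spawn workers (e.g. Windows)
    main()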
If you are using Python 2, Pool() cannot be used as a context manager, but you can use it like this:
p = Pool()
processed = p.map(processItem, lst)
p.close()  # no more tasks will be submitted
p.join()   # wait for the workers to finish
Your function processItem() will be called for each element of your lst, and the results will form a new list, processed (order is preserved).
By default, Pool() spawns as many worker processes as your CPU has cores, and each worker picks up a new task as soon as it finishes the previous one, until every element has been processed.
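This also means you do not have to split the list into N sub-lists yourself: map() already slices the iterable into chunks and hands them to the workers. If you want to tune either side, both knobs are exposed; in this sketch the values 4 and 200 are arbitrary:

from multiprocessing import Pool

with Pool(processes=4) as p:  # cap the pool at 4 worker processes
    # chunksize controls how many items each worker receives at a time
    processed = p.map(processItem, lst, chunksize=200)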
Upvotes: 5