Reputation: 747
So, I have a 9000-line CSV file. I have read it and stored it in a dictionary of lists keyed by a string m, so list[m] gives one list of items. What I want to do is loop over every item in list[m] and pass it to a function processItem(item). This processItem returns a string in CSV-like format, and my aim is to write the result of processItem for every item in the list to a file. Is there a way to do this in a multithreaded fashion?
I think I should divide the list into N sub-lists and then process those sub-lists in parallel: every thread would return the string produced from its sub-list, the results would then be merged, and finally the merged result would be written to a file. How can I implement that?
Upvotes: 1
Views: 2144
Reputation: 19717
This is a perfect use case for the multiprocessing module and its Pool() (be aware that the threading module will not speed this up: CPython's GIL prevents threads from running Python code in parallel). You have to apply a function to each element of your list, so this is easily parallelized.
from multiprocessing import Pool
with Pool() as p:
    processed = p.map(processItem, lst)
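For completeness, here is a minimal end-to-end sketch that also merges the results and writes them to a file. The dict name data, the key "m", the output file name, and the body of processItem() are placeholders for your own code:

from multiprocessing import Pool

def processItem(item):
    # placeholder: build one CSV-formatted line from an item
    return ",".join(str(field) for field in item)

def main():
    # placeholder input: a dictionary of lists keyed by string, as in the question
    data = {"m": [("a", 1), ("b", 2), ("c", 3)]}
    with Pool() as p:
        lines = p.map(processItem, data["m"])
    # merge the per-item strings and write them out in one go
    with open("output.csv", "w") as f:
        f.write("\n".join(lines) + "\n")

if __name__ == "__main__":
    # the guard is required on platforms that spawn workers (e.g. Windows)
    main()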
If you are using Python 2, Pool() cannot be used as a context manager, but you can use it like this:
p = Pool()
processed = p.map(processItem, lst)
p.close()  # no more tasks will be submitted
p.join()   # wait for the workers to finish
Your function processItem() will be called for each element of your lst, and the results will form a new list, processed (order is preserved).
By default, Pool() spawns as many worker processes as your CPU has cores, and each worker picks up a new task as soon as it finishes the previous one, until every element has been processed.
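This also means you do not have to split the list into N sub-lists yourself: map() already slices the iterable into chunks and hands them to the workers. If you want to tune either side, both knobs are exposed; in this sketch the values 4 and 200 are arbitrary:

from multiprocessing import Pool

with Pool(processes=4) as p:  # cap the pool at 4 worker processes
    # chunksize controls how many items each worker receives at a time
    processed = p.map(processItem, lst, chunksize=200)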
Upvotes: 5