ProfHase85

Reputation: 12183

Parallel writing to a list in Python

I have multiple parallel threads writing into one list in Python. My code is:

import threading

global_list = []

class MyThread(threading.Thread):
    ...
    def run(self):
        results = self.calculate_results()
        global_list.extend(results)


def total_results():
    for param in params:
        t = MyThread(param)
        t.start()
    # busy-wait until only the main thread is left
    while threading.active_count() > 1:
        pass
    return global_list

I don't like this approach because it has:

  1. A global variable. How could I use a variable local to the `total_results` function instead?
  2. A clumsy way of checking when all threads are done and the list can be returned. What would be the standard way?

Upvotes: 0

Views: 1389

Answers (2)

sebdelsol

Reputation: 1085

1 - Use a class variable shared between all Worker instances to collect your results

from threading import Thread

class Worker(Thread):
    results = []
    ...

    def run(self):
        results = self.calculate_results()
        Worker.results.extend(results)  # list.extend is atomic in CPython, so no lock is needed here

2 - Use join() to wait until all the threads are done; unlike a busy loop, it blocks without taking CPU time away from the workers

def total_results(params):
    # create all workers
    workers = [Worker(p) for p in params]

    # start all workers
    for w in workers:
        w.start()

    # wait for all of them to finish
    for w in workers:
        w.join()

    # get the results
    return Worker.results
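
If you would rather have no shared state outside the function at all, a minimal sketch along these lines might work. It keeps the list local to `total_results` and hands it to plain `threading.Thread` objects through a closure. Note that `calculate_results` here is a hypothetical stand-in for your real computation, and the lock is only there as a conservative choice that does not rely on CPython-specific behaviour.

```python
import threading


def calculate_results(param):
    # Hypothetical stand-in for the real per-parameter computation.
    return [param, param * 2]


def total_results(params):
    results = []             # local to this call, shared only with its threads
    lock = threading.Lock()  # conservative: guards the shared list explicitly

    def work(param):
        partial = calculate_results(param)
        with lock:
            results.extend(partial)

    threads = [threading.Thread(target=work, args=(p,)) for p in params]
    for t in threads:
        t.start()
    for t in threads:
        t.join()             # block until every worker has finished
    return results
```

The order of the items in the returned list depends on which thread finishes first, so sort or key the results if the order matters.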

Upvotes: 1

John Zwinck

Reputation: 249153

Is your computation CPU-intensive? If so, you should look at the multiprocessing module, which is included with Python and offers a fairly easy-to-use Pool class: you feed it compute tasks and later collect all the results. If you need a lot of CPU time this will be faster anyway, because threads do not help CPU-bound Python code much: in CPython the global interpreter lock (GIL) lets only one thread execute Python bytecode at a time within a process. Multiprocessing sidesteps that by using separate processes (and the Pool abstraction makes your job easier). And if you really want to stick with threads, multiprocessing has a ThreadPool too (multiprocessing.pool.ThreadPool).
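
As a rough sketch of the pool approach, the following uses `multiprocessing.pool.ThreadPool` and its `map` method to collect one list per parameter, then flattens them. The function `calculate_results` and the `workers` argument are assumptions made for illustration, not part of the original code.

```python
from itertools import chain
from multiprocessing.pool import ThreadPool


def calculate_results(param):
    # Hypothetical stand-in for the real per-parameter computation.
    return [param, param * 2]


def total_results(params, workers=4):
    # map() blocks until every task is done and returns the per-task
    # results in the same order as params, so no shared list is needed.
    with ThreadPool(workers) as pool:
        per_param = pool.map(calculate_results, params)
    # Flatten the list of lists into a single list.
    return list(chain.from_iterable(per_param))
```

For CPU-bound work, swapping `ThreadPool` for `multiprocessing.Pool` should run the tasks in separate processes. In that case `calculate_results` has to be a picklable top-level function, and on platforms that spawn new processes the call belongs under an `if __name__ == "__main__":` guard.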

Upvotes: 2
