kezzos

Reputation: 3221

Why is get() slow in multiprocessing?

I have a basic multiprocessing class which takes some parameters and sends them off to a worker:

import multiprocessing as mp
import time

class Multi(object):
    def __init__(self, pool_parameters, pool_size):
        self.pool_parameters = pool_parameters  # Parameters in a tuple
        self.pool_size = pool_size
        self.pool = mp.Pool(self.pool_size)
        # Submit one task per pool process
        self.results = [self.pool.apply_async(worker, args=(self.pool_parameters[i],))
                        for i in range(self.pool_size)]
        time1 = time.time()
        self.output = [r.get() for r in self.results]  # Output objects in here
        print(time.time() - time1)

def worker(*args):
    # Do stuff
    return stuff

However the r.get() line seems to take ages. If I have a pool_size of 1, the worker returns its result in 0.1 seconds, but the r.get() line takes another 1.35 seconds. Why does it take so long, especially if only one process is started?

EDIT: For a single process and using the worker to return a single None value, the self.output line still takes 1.3 seconds on my system (using time.time() to time that line)

EDIT2: Sorry, I found the problem and I don't think it has to do with multiprocessing. The problem seems to come from importing various other modules. When I got rid of my imports, the time dropped to 0.1 seconds. No idea why though...

Upvotes: 3

Views: 1900

Answers (1)

dano

Reputation: 94951

You're seeing poor performance because you're sending a large object between the processes. Pickling the object in the child, sending those bytes between processes, and then unpickling them in the parent takes a non-trivial amount of time. This is one of the reasons the best practices for multiprocessing suggest avoiding large amounts of shared state:

Avoid shared state

As far as possible one should try to avoid shifting large amounts of data between processes.

You'll probably be able to isolate this behavior if you call pickle.loads(pickle.dumps(obj)) on your object. I would expect it to take almost as long as the get() call.
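A minimal sketch of that check, assuming the object returned by worker can be rebuilt in the parent for testing. The helper name time_pickle_round_trip and the payload below are hypothetical stand-ins, not part of the original code:

```python
import pickle
import time


def time_pickle_round_trip(obj):
    """Return (elapsed_seconds, copy) for one pickle.dumps/pickle.loads round trip."""
    start = time.time()
    data = pickle.dumps(obj)   # roughly what the child does before sending
    copy = pickle.loads(data)  # roughly what the parent does on get()
    return time.time() - start, copy


if __name__ == "__main__":
    # Hypothetical payload; substitute whatever worker() actually returns.
    payload = {"values": list(range(1000000))}
    elapsed, _ = time_pickle_round_trip(payload)
    print("pickle round trip took %.3f s" % elapsed)
```

If this number is close to the time spent in get(), serialization is the likely bottleneck. If it is much smaller, the delay is probably coming from somewhere else, such as process startup.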

Upvotes: 4
