George

Reputation: 2093

Why is pool.map slower than normal map?

I'm trying the following code:

import multiprocessing
import time
import random

def square(x):
    return x**2

pool = multiprocessing.Pool(4)

l = [random.random() for i in xrange(10**8)]

now = time.time()
pool.map(square, l)
print time.time() - now

now = time.time()
map(square, l)
print time.time() - now

and the pool.map version consistently runs several seconds more slowly than the normal map version (19 seconds vs 14 seconds).

I've looked at the questions "Why is multiprocessing.Pool.map slower than builtin map?" and "multiprocessing.Pool() slower than just using ordinary functions", and they seem to chalk it up to either IPC overhead or disk saturation, but in my example those aren't obviously the issue. I'm not reading from or writing to disk, and the computation is long enough that IPC overhead should be small compared to the total time saved by multiprocessing (since I'm doing the work on 4 cores instead of 1, I estimate the computation time should drop from 14 seconds to about 3.5 seconds). I don't think I'm saturating my CPU: cat /proc/cpuinfo shows that I have 4 cores, but even with only 2 processes it's still slower than the normal map function (and even slower than with 4 processes). What else could be slowing down the multiprocessed version? Am I misunderstanding how IPC overhead scales?

If it's relevant, this code is written in Python 2.7, and my OS is Linux Mint 17.2.

Upvotes: 0

Views: 1373

Answers (1)

noxdafox

Reputation: 15040

pool.map chops the list into chunks and submits those to the worker processes as separate tasks. Every element still has to be pickled and sent to a worker, and every result has to be pickled and sent back.

The work a single process is doing is shown in your code:

def square(x):
    return x**2

This operation takes very little time on modern CPUs, no matter how big the number is.

In your example you're creating a huge list and performing a trivial operation on every single element. The IPC overhead of shipping each element to a worker and its result back therefore outweighs the computation itself, whereas the regular map function runs in a single process and is optimized for fast looping.
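To get a rough feel for that overhead, the sketch below compares the time spent squaring the elements with the time spent pickling and unpickling the inputs and the results. It is written in Python 3 syntax, compare_costs is a made-up helper name, and the pickle round trips only approximate the serialization part of IPC; they do not include the pipe transfer or the pool's task handling, so treat the numbers as a lower bound rather than a measurement of what Pool.map does.

```python
import pickle
import random
import time


def square(x):
    return x ** 2


def compare_costs(n_items=10**6):
    """Time the computation and a pickle round trip of its inputs and outputs."""
    data = [random.random() for _ in range(n_items)]

    # Cost of the actual work in a single process.
    start = time.time()
    results = [square(x) for x in data]
    compute_time = time.time() - start

    # Arguments are serialized on their way to the workers and results on
    # their way back; two pickle round trips stand in for that here.
    start = time.time()
    data_copy = pickle.loads(pickle.dumps(data))
    results_copy = pickle.loads(pickle.dumps(results))
    transfer_time = time.time() - start

    return compute_time, transfer_time, data_copy == data, results_copy == results


if __name__ == "__main__":
    compute_time, transfer_time, _, _ = compare_costs()
    print("compute: %.3fs, pickle round trips: %.3fs" % (compute_time, transfer_time))
```

If the two numbers come out in the same ballpark on your machine, spreading the computation over 4 processes cannot pay off, because the serialization is done on top of it.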

To see your example behave as you expect, add a time.sleep(0.1) call to the square function to simulate a long-running task. You will also want to reduce the size of the list, or it will take forever to complete.
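A minimal sketch of that experiment, in Python 3 syntax, could look like the following. The names slow_square and time_both, the 0.05 second sleep and the list size of 40 are arbitrary choices for illustration, not values from your code.

```python
import multiprocessing
import time


def slow_square(x):
    # time.sleep stands in for real per-item work; 0.05 s is an arbitrary choice.
    time.sleep(0.05)
    return x ** 2


def time_both(n_items=40, n_procs=4):
    """Run the same work with the builtin map and with pool.map, timing each."""
    data = list(range(n_items))

    start = time.time()
    serial = list(map(slow_square, data))
    serial_time = time.time() - start

    pool = multiprocessing.Pool(n_procs)
    try:
        start = time.time()
        parallel = pool.map(slow_square, data)
        parallel_time = time.time() - start
    finally:
        pool.close()
        pool.join()

    return serial, parallel, serial_time, parallel_time


if __name__ == "__main__":
    serial, parallel, t_serial, t_parallel = time_both()
    print("serial: %.2fs, pool: %.2fs" % (t_serial, t_parallel))
```

With per-item work this expensive, the cost of moving a single number between processes is negligible, so the pool version should finish in roughly a quarter of the serial time on 4 processes.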

Upvotes: 1
