Reputation: 18790
I am working on Ubuntu 12 with 8 CPUs, as reported by the System Monitor.
The testing code is:

    import multiprocessing as mp

    def square(x):
        return x**2

    if __name__ == '__main__':
        pool = mp.Pool(processes=4)
        pool.map(square, range(100000000))
        pool.close()

        # for i in range(100000000):
        #     square(i)
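To make the comparison concrete, here is a small timing harness for the two variants (a sketch, not the original poster's code; the range is reduced so it finishes quickly, and `time.time()` is used for simplicity):

```python
import multiprocessing as mp
import time

def square(x):
    return x ** 2

if __name__ == '__main__':
    n = 100000  # reduced from 100000000 so the comparison runs quickly

    # Timed with a pool of 4 worker processes
    start = time.time()
    with mp.Pool(processes=4) as pool:
        pool.map(square, range(n))
    pool_time = time.time() - start

    # Timed serially in the main process
    start = time.time()
    for i in range(n):
        square(i)
    serial_time = time.time() - start

    print("pool: %.3fs, serial: %.3fs" % (pool_time, serial_time))
```

For a task this cheap, the serial loop will typically win, which is exactly the behavior described above.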
The problems are:

1) All workload seems to be scheduled to just one core, which gets close to 100% utilization, despite the fact that several processes are started. Occasionally all the workload migrates to another core, but it is never distributed among them.

2) Running without multiprocessing is faster:

    for i in range(100000000):
        square(i)
I have read similar questions on Stack Overflow, like Python multiprocessing utilizes only one core, but still got no applicable result.
Upvotes: 0
Views: 479
Reputation: 43234
The function you are using is way too short (i.e. it doesn't take enough time to compute), so you spend all your time in the synchronization between processes, which has to be done serially (so why not on a single processor?). Try this:
    import multiprocessing as mp

    def square(x):
        for i in range(10000):
            j = i**2
        return x**2

    if __name__ == '__main__':
        # pool = mp.Pool(processes=4)
        # pool.map(square, range(1000))
        # pool.close()
        for i in range(1000):
            square(i)
You will see that suddenly multiprocessing works well: it takes ~2.5 seconds to complete, while it takes ~10 seconds without it.
Note: If using Python 2, you might want to replace all occurrences of range with xrange.
Edit: I replaced time.sleep with a CPU-intensive but useless calculation.
Addendum: In general, for multi-CPU applications, you should try to make each CPU do as much work as possible without returning to the same process. In a case like yours, this means splitting the range into almost-equal-sized lists, one per CPU, and sending them to the various CPUs.
Upvotes: 2
Reputation: 31474
When you do:
    pool.map(square, range(100000000))
before invoking the map function, it has to create a list with 100000000 elements, and this is done by a single process. That's why you see a single core working.
Use a generator instead, so each core can pull a number out of it, and you should see the speedup:

    pool.map(square, xrange(100000000))
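In Python 3, `range` is already lazy, so no such replacement is needed; if you additionally want to consume results incrementally rather than as one big list, `pool.imap` is a related option (a sketch, not part of the original answer):

```python
import multiprocessing as mp

def square(x):
    return x ** 2

if __name__ == '__main__':
    with mp.Pool(processes=4) as pool:
        # imap yields results as they become available instead of
        # collecting them all in a list first; a generous chunksize
        # keeps the per-item dispatch overhead low
        total = sum(pool.imap(square, range(10000), chunksize=1000))
    print(total)
```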
Upvotes: 1
Reputation: 37003
It isn't sufficient simply to import the multiprocessing library to make use of multiple processes to schedule your work. You actually have to create processes too!
Your work is currently scheduled to a single core because you haven't done so, and so your program is a single process with a single thread.
Naturally, when you start a new process simply to square a number, you are going to get slower performance. The overhead of process creation makes sure of that. So your process pool will very likely take longer than a single-process run.
Upvotes: 0