vkb

Reputation: 458

Python multiprocessing takes more time

I have a server with 12 cores and 28 GB of RAM. I am running two versions of a Python script: one using multiprocessing and one sequential. I expected Multiprocessing.py to finish earlier than Sequential.py, but the multiprocessing code takes about 5 times longer (120 s) than the sequential code (25 s).

Multiprocessing.py

import multiprocessing, time

def cube(x):
    print(x**3)

if __name__ == '__main__':
    jobs = []
    start = time.time()
    for i in range(5000):
        # one process per number: 5000 process creations
        p = multiprocessing.Process(target=cube, args=(i,))
        jobs.append(p)
        p.start()
    for p in jobs:
        p.join()  # wait for every child before stopping the clock
    end = time.time()
    print(end - start)

Sequential.py

import time

def cube(x):
    print(x**3)

if __name__ == '__main__':
    start = time.time()
    for i in range(5000):
        cube(i)
    end = time.time()
    print(end - start)

Can you please help?

Upvotes: 1

Views: 8168

Answers (3)

llcao

Reputation: 101

Would you like to try a Pool? A pool creates its worker processes once and reuses them for every task, instead of paying the process start-up cost 5000 times. For example, the following should work:

from multiprocessing import Pool

def cube(x):
    return x**3  # must be defined at module level so the workers can import it

if __name__ == '__main__':
    with Pool(12) as p:  # one worker per core on the 12-core server
        results = p.map(cube, range(5000))

Upvotes: 0

Raymond Hettinger

Reputation: 226664

The problem is that too little work is being done relative to the inter-process communication (IPC) overhead.

The cube function isn't a good candidate for a multiprocessing speedup. Try something "more interesting", like a function that computes the sum of cubes from 1 to n or some such:

import time

def sum_of_cubes(n):
    return sum(x**3 for x in range(n))

if __name__ == '__main__':

    from multiprocessing.pool import ThreadPool as Pool

    pool = Pool(25)

    start = time.time()
    print(pool.map(sum_of_cubes, range(1000, 100000, 1000)))
    end = time.time()
    print(end - start)

The general rules are:

  • don't start more worker processes than your cores can benefit from
  • don't pass in a lot of data or return a lot of data (too much IPC load)
  • do significant work in each task relative to the IPC overhead (a sketch applying these rules follows below)
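Applied to the example above, here is a minimal process-based sketch (assuming Python 3; the pool size and chunksize values are just illustrative) that follows all three rules:

import os, time
from multiprocessing import Pool

def sum_of_cubes(n):
    return sum(x**3 for x in range(n))

if __name__ == '__main__':
    start = time.time()
    # Size the pool to the machine rather than an arbitrary number.
    with Pool(os.cpu_count()) as pool:
        # chunksize batches many arguments into a single IPC message,
        # reducing the per-task communication overhead.
        results = pool.map(sum_of_cubes, range(1000, 100000, 1000), chunksize=10)
    end = time.time()
    print(end - start)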

Upvotes: 7

Ezekiel Kruglick

Reputation: 4686

You shouldn't be starting a process for each multiplication. Start 12 processes and either pass each one a batch of numbers or hand out the numbers at process creation.
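A rough sketch of that approach (the cube_range helper is hypothetical, assuming Python 3): each of the 12 processes receives its slice of the numbers at creation time, so process creation happens 12 times instead of 5000 times.

import multiprocessing, time

def cube_range(lo, hi):
    # Each worker handles a whole slice of the input.
    for x in range(lo, hi):
        _ = x**3  # do the computation; printing from many processes mostly measures I/O

if __name__ == '__main__':
    start = time.time()
    n, workers = 5000, 12
    step = n // workers
    jobs = []
    for i in range(workers):
        lo = i * step
        hi = n if i == workers - 1 else lo + step
        p = multiprocessing.Process(target=cube_range, args=(lo, hi))
        jobs.append(p)
        p.start()
    for p in jobs:
        p.join()
    print(time.time() - start)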

If you profile the original code, I'm fairly certain you'll find almost all of the time spent in process creation and clean-up.
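One quick way to test that claim is to time processes that do no work at all; a minimal sketch (assuming Python 3):

import multiprocessing, time

def noop():
    pass  # no work at all: any time measured is pure process overhead

if __name__ == '__main__':
    start = time.time()
    for _ in range(100):
        p = multiprocessing.Process(target=noop)
        p.start()
        p.join()
    elapsed = time.time() - start
    print('per-process overhead: %.1f ms' % (elapsed * 1000 / 100))

If the per-process figure multiplied by 5000 lands near the reported 120 seconds, that confirms where the time is going.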

ALSO: I've done testing on how many processes to run versus core count, and the optimum depends on the architecture (e.g. some Intel chips have two hardware threads per core) and the operating system (Linux seems to handle oversubscription better than Windows). If you're on Windows, I'd advise trying process counts of 0.8-2.2x the core count. On Linux you can go higher.
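For illustration only, a sketch that sizes the pool from the detected logical core count (the 1.5x factor is an arbitrary midpoint of the 0.8-2.2x band above, not a recommendation):

import os
from multiprocessing import Pool

def cube(x):
    return x**3

if __name__ == '__main__':
    logical = os.cpu_count() or 1         # logical cores; counts SMT threads twice
    workers = max(1, int(logical * 1.5))  # somewhere in the 0.8x-2.2x band
    with Pool(workers) as pool:
        pool.map(cube, range(5000), chunksize=500)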

Upvotes: 0
