Multiprocessing takes longer?

Question

I am trying to understand how to get children to write to a parent's variables. Maybe I'm doing something wrong here, but I would have imagined that multiprocessing would have taken a fraction of the time that it is actually taking:

import multiprocessing, time

def h(x):
    h.q.put('Doing: ' + str(x))
    return x

def f_init(q):
    h.q = q

def main():
    q = multiprocessing.Queue()
    p = multiprocessing.Pool(None, f_init, [q])
    results = p.imap(h, range(1,5))
    p.close()

-----Results-----:

1
2
3
4
Multiprocessed: 0.0695610046387 seconds
1
2
3
4
Normal: 2.78949737549e-05 seconds # much shorter

    for i in range(len(range(1,5))):
        print results.next() # prints 1, 4, 9, 16

if __name__ == '__main__':
    start = time.time()
    main()
    print "Multiprocessed: %s seconds" % (time.time()-start)             

    start = time.time()
    for i in range(1,5):
        print i
    print "Normal: %s seconds" % (time.time()-start)

steveha · Accepted Answer

@Blender basically already answered your question, but as a comment. There is some overhead associated with the multiprocessing machinery, so if you incur the overhead without doing any significant work, it will be slower.

Try actually doing some work that parallelizes well. For example, write Python code to open a file, scan it using a regular expression, and pull out matching lines; then make a list of ten big files and time how long it takes to do all ten with multiprocessing vs. plain Python. Or write code to compute an expensive function and try that.

I have used multiprocessing.Pool() just to run a bunch of instances of an external program. I used subprocess to run an audio encoder, and it ran four instances of the encoder at once for a noticeable speedup.

Multiprocessing takes longer?

Answers (1)

Related Questions