Why append list is slower in Multiprocessing

Question

During testing I find out in the following, MP method run a bit slower

def eat_time(j):
    result = []
    for j in range(10**4):
        a = 0
        for i in range(1000):
            a += 101
            result.append(a)
    return result

if __name__ == '__main__':
    #MP method
    t = time.time()
    pool = Pool()
    result = []
    data = pool.map(eat_time, [i for i in range(5)])
    for d in data:
        result += d
    print(time.time()-t) #11s for my computer

    #Normal method
    t = time.time()
    integers = []
    for i in range(5):
        integers += eat_time(i)
    print(time.time()-t) #8s for my computer

However, if I don't require it to aggregate the data by changing eat_time() to

def eat_time(j):
    result = []
    for j in range(10**4):
        a = 0
        for i in range(1000):
            a += 101
            #result.append(a)
    return result

The MP time is much faster and now for my computer just run 3s, while normal method still take 8s. (As expected)

It looks strange to me as result is declared individually in method, I don't expect appending completely ruin the MP.

May I know is there a correct way to do this? And why MP is slower when append involved?

Edited for comment

Thx for @torek and @akhavro clarify the point.

Yes, I understand creating process take times, that's why the problem raised.

Actually the original code put the for-loop outside and call the simple method again and again, it is a bit faster over normal method in significantly many task (my case more than 10**6 calls).

Therefore I change to put code inside and make the method a bit more complicated. By moving for j in range(10**4): this line into eat_time().

But it seems making the code complicated causes communication lag due to larger data size.

So, probably the answer is no way to solve it.

Hannu · Accepted Answer

It is not append that causes your slowness but returning the result with appended elements. You can test it by changing your code to do the append but return only the first few elements of your result. Now it should work much faster again.

When you return your result from a Pool worker, this is in practice implemented as a queue from multiprocessing. It works but it is not a miracle performer, definitely much slower than just manipulating in-memory structures. When you return a lot of data, the queue needs to transmit a lot.

There is no easy workaround. You could try shared memory but I do not personally like it due to added complexity. The better way would be to redesign your application so that it does not need to transmit a lot of data between processes. For example, would it be possible to process data in your worker further so that you do not need to return it all but only a processed subset?

Why append list is slower in Multiprocessing

Answers (1)

Related Questions