Sven van den Boogaart

Reputation: 12331

Python multiprocessing wait for sleep

I'm trying to find out how multiprocessing works in Python. The following example is what I made:

import requests
from multiprocessing import Process
import time

def f(name):
    print('hello', name)
    time.sleep(15)
    print('ended', name)


if __name__ == '__main__':
    urls = [
        'http://python-requests.org',
        'http://httpbin.org',
        'http://python-guide.org'
    ]

    for url in urls:
        p = Process(target=f, args=(url,))
        p.start()
        p.join()
    print("finished")

What I tried to simulate in f is a request to a URL that has a timeout of 15 seconds. I expected all the requests to start at almost the same time and finish at about the same time. What actually happens is that they start one after another, each waiting until the previous one has finished. So the result is:

hello http://python-requests.org

ended http://python-requests.org

hello http://httpbin.org

ended http://httpbin.org

hello http://python-guide.org

ended http://python-guide.org

So what actually happens? Why would one use the code above instead of just doing:

    for url in urls:
        f(url)

Upvotes: 2

Views: 3238

Answers (2)

Jean-François Fabre

Reputation: 140307

The problem is your loop:

for url in urls:
    p = Process(target=f, args=(url,))
    p.start()
    p.join()

You start a process, wait for it to complete, and only then start the next one, so the processes never run at the same time.

Instead, create your process list, start them, and wait for them:

pl = [Process(target=f, args=(url,)) for url in urls]
for p in pl:
    p.start()
for p in pl:
    p.join()

Note that in this case, using Process is probably overkill, since threads would do the job very well (there is no heavy Python computation involved, only system calls and networking).

To switch to threads, just use multiprocessing.dummy instead, so that your program structure remains the same:

import multiprocessing.dummy as multiprocessing
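
A minimal sketch of that switch, assuming a short 0.2-second delay in place of the 15-second sleep. DELAY, run_all and the shared log list are names made up for this illustration and are not part of the question's code. multiprocessing.dummy.Process is a thin wrapper around threading.Thread, so the start-all-then-join-all pattern carries over unchanged:

```python
import time
from multiprocessing.dummy import Process  # thread-backed stand-in for Process

DELAY = 0.2  # hypothetical short delay standing in for the 15-second sleep


def f(name, log):
    # Record events in a list instead of printing so they can be inspected
    log.append(('hello', name))
    time.sleep(DELAY)
    log.append(('ended', name))


def run_all(urls):
    """Start every worker first, then join them all; return (log, elapsed)."""
    log = []  # a plain list is enough here because threads share memory
    workers = [Process(target=f, args=(url, log)) for url in urls]
    start = time.time()
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return log, time.time() - start


if __name__ == '__main__':
    urls = [
        'http://python-requests.org',
        'http://httpbin.org',
        'http://python-guide.org'
    ]
    log, elapsed = run_all(urls)
    for event, name in log:
        print(event, name)
    print('finished in %.2f s' % elapsed)
```

If the workers ran one after another, the elapsed time would be close to three delays; started together, it stays close to one. Note that the shared list only works because these are threads. With real processes, each child gets its own copy of log.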

Upvotes: 4

Mathieu

Reputation: 5776

You only ever have one process running at a time. That single worker takes the first input, runs f, sleeps for 15 seconds and leaves f; only then is the second input handled (cf. the documentation).

You could instead map your function f over the inputs. The example below spawns 2 processes (2 workers).

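import multiprocessing as mp

# f and urls are the function and the list defined in the question

if __name__ == '__main__':
    # A pool of 2 worker processes shares the inputs between them
    with mp.Pool(processes=2) as p:
        p.map(f, urls)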
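
To see how the number of workers affects the total time, here is a rough sketch under a few assumptions: a 0.2-second delay replaces the 15-second sleep, f returns a string instead of printing, and the pool comes from multiprocessing.dummy (threads with the same Pool API) so that the timing helper can be called from anywhere. DELAY and run_pool are hypothetical names used only for this illustration:

```python
import time
from multiprocessing.dummy import Pool  # thread pool exposing the Pool API

DELAY = 0.2  # hypothetical stand-in for the 15-second wait


def f(name):
    time.sleep(DELAY)
    return 'ended ' + name


def run_pool(urls, workers):
    """Map f over urls with a fixed number of workers; return (results, elapsed)."""
    start = time.time()
    with Pool(processes=workers) as p:
        results = p.map(f, urls)  # blocks until every input is processed
        elapsed = time.time() - start
    return results, elapsed


if __name__ == '__main__':
    urls = [
        'http://python-requests.org',
        'http://httpbin.org',
        'http://python-guide.org'
    ]
    for workers in (1, 2, 3):
        results, elapsed = run_pool(urls, workers)
        print('%d worker(s): %.2f s' % (workers, elapsed))
```

With 2 workers and 3 inputs, the third input has to wait for a free worker, so the run takes about two delays rather than one. map returns the results in input order regardless of which worker finished first.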

Upvotes: 1
