Reputation: 12331
I'm trying to understand how multiprocessing works in Python. The following example is what I made:
import requests
from multiprocessing import Process
import time

def f(name):
    print('hello', name)
    time.sleep(15)
    print('ended', name)

if __name__ == '__main__':
    urls = [
        'http://python-requests.org',
        'http://httpbin.org',
        'http://python-guide.org'
    ]
    for url in urls:
        p = Process(target=f, args=(url,))
        p.start()
        p.join()
    print("finished")
What I tried to simulate in f is a request to a URL that takes 15 seconds. What I expected to happen is that all the requests would start at almost the same time and finish at the same time. But what actually happens is that they start one after another, each waiting until the previous one has finished. So the result is:
hello http://python-requests.org
ended http://python-requests.org
hello http://httpbin.org
ended http://httpbin.org
hello http://python-guide.org
ended http://python-guide.org
So what actually happens? And why would one use the code above instead of just doing:

for url in urls:
    f(url)
Upvotes: 2
Views: 3238
Reputation: 140307
The problem is your loop:

for url in urls:
    p = Process(target=f, args=(url,))
    p.start()
    p.join()

You start a process, then wait for it to complete, and only then start the next one ...
Instead, create your process list, start them, and wait for them:
pl = [Process(target=f, args=(url,)) for url in urls]
for p in pl:
    p.start()
for p in pl:
    p.join()
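For illustration, here is a self-contained version of that pattern, using a short time.sleep as a stand-in for the slow request (the sleep length is shortened from the question's 15 seconds purely for convenience). Since all processes start before any join, the total runtime stays near one sleep interval rather than three:

```python
import time
from multiprocessing import Process

def f(name):
    print('hello', name)
    time.sleep(2)  # stand-in for a slow network request
    print('ended', name)

if __name__ == '__main__':
    urls = [
        'http://python-requests.org',
        'http://httpbin.org',
        'http://python-guide.org'
    ]
    start = time.time()
    pl = [Process(target=f, args=(url,)) for url in urls]
    for p in pl:
        p.start()   # all three processes begin immediately
    for p in pl:
        p.join()    # then wait for every one of them to finish
    # elapsed is ~2s (one sleep), not ~6s (three sleeps in a row)
    print('elapsed: %.1fs' % (time.time() - start))
```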
Note that in this case, using Process is probably overkill, since threads would do the job just as well (there is no heavy Python computation involved, only system calls and networking). To switch to threads, just use multiprocessing.dummy instead, so your program structure remains the same:
import multiprocessing.dummy as multiprocessing
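For example, the question's program runs unchanged on threads with only the import swapped (a sketch; the print lines may interleave, since all workers run concurrently):

```python
import time
import multiprocessing.dummy as multiprocessing  # thread-backed, same API

def f(name):
    print('hello', name)
    time.sleep(2)  # stand-in for a slow request
    print('ended', name)

if __name__ == '__main__':
    urls = [
        'http://python-requests.org',
        'http://httpbin.org',
        'http://python-guide.org'
    ]
    pl = [multiprocessing.Process(target=f, args=(url,)) for url in urls]
    for p in pl:
        p.start()
    for p in pl:
        p.join()
    print('finished')
```

Here multiprocessing.dummy.Process is a thin wrapper around threading.Thread, so start() and join() behave exactly as before.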
Upvotes: 4
Reputation: 5776
You only spawn one process at a time. Thus the process (a single worker) takes the first input, runs f, sleeps for 15 seconds, exits f; and only then takes the second input. Cf. the docs.
You could try to map your function f over the inputs. In the example below, you spawn 2 processes (2 workers).
import multiprocessing as mp

if __name__ == '__main__':
    # f and urls as defined in the question
    with mp.Pool(processes=2) as p:
        p.map(f, urls)
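A self-contained sketch of that approach, with a placeholder f that just sleeps and returns (a real f would issue the request; note that f must be defined at module level so the pool can pickle it):

```python
import time
import multiprocessing as mp

def f(name):
    # placeholder for the real work (e.g. an HTTP request)
    time.sleep(1)
    return 'ended ' + name

if __name__ == '__main__':
    urls = [
        'http://python-requests.org',
        'http://httpbin.org',
        'http://python-guide.org'
    ]
    # 2 workers: the first two URLs run in parallel,
    # the third waits until a worker is free
    with mp.Pool(processes=2) as p:
        results = p.map(f, urls)
    print(results)
```

map blocks until all inputs are processed and returns the results in input order.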
Upvotes: 1