Yu Pan

Reputation: 1

urllib2 urlopen blocks when used with multiprocessing

I want to use multiprocessing to speed up report generation for every company.

The following is my test script:

from multiprocessing import Pool
import os, time, random, json, urllib, urllib2, uuid

def generate_report(url, cookie, company_id, period, remark):
    try:
        start = time.time()
        print('Run task %s (%s)... at: %s \n' % (company_id, os.getpid(), start))

        values = {
            'companies': json.dumps([company_id]),
            'month_year': period,
            'remark': remark
        }

        data = urllib.urlencode(values)

        headers = {
            'Cookie': cookie
        }
        url = "%s?pid=%s&uuid=%s" % (url, os.getpid(), uuid.uuid4().hex)
        request = urllib2.Request(url, data, headers)
        response = urllib2.urlopen(request)
        content = response.read()
        end = time.time()
        print('Task %s runs %0.2f seconds, end at: %s \n' % (company_id, (end - start), end))
        return content
    except Exception as exc:
        return str(exc)  # exc.message is deprecated in Python 2

if __name__=='__main__':
    print('Parent process %s.\n' % os.getpid())
    p = Pool()

    url = 'http://localhost/fee_calculate/generate-single'
    cookie = 'xxx'
    company_ids = [17, 15, 21, 19]
    period = '2017-08'
    remark = 'test add remark from python script'

    results = [p.apply_async(generate_report, args=(url, cookie, company_id, period, remark)) for company_id in company_ids]
    for r in results:
        print(r.get())

But I get the following output:

Run task 17 (15952)... at: 1506568581.98
Run task 15 (17192)... at: 1506568581.99
Run task 21 (18116)... at: 1506568582.01
Run task 19 (1708)... at: 1506568582.05

Task 17 runs 13.50 seconds, end at: 1506568595.48

{"success":true,"info":"Successed!"}
Task 15 runs 23.60 seconds, end at: 1506568605.59

{"success":true,"info":"Successed!"}
Task 21 runs 34.35 seconds, end at: 1506568616.36

{"success":true,"info":"Successed!"}
Task 19 runs 44.38 seconds, end at: 1506568626.44

{"success":true,"info":"Successed!"}

It seems that urllib2.urlopen(request) is blocking: the requests are not sent in parallel, but one after another.

In order to test multiprocessing, the endpoint fee_calculate/generate-single contains only the following significant code:

sleep(10)

Please give me some advice, thanks.

PS: Platform: Windows 10, Python 2.7, 4 CPUs

Upvotes: 0

Views: 332

Answers (1)

DBrowne

Reputation: 723

This isn't a multiprocessing issue. Multiprocessing is working as it should, which you can see from the fact that all of the tasks start at approximately the same time.
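One quick way to confirm this (an illustrative check, not part of your script; `fake_report` is just a stand-in for `generate_report`) is to replace the HTTP call with a plain sleep and time the pool:

```python
import time
from multiprocessing import Pool


def fake_report(company_id):
    # Stand-in for generate_report: same kind of delay, no network.
    time.sleep(2)
    return company_id


if __name__ == '__main__':
    start = time.time()
    pool = Pool(4)
    # Four 2-second tasks on four workers finish in ~2s, not ~8s.
    results = pool.map(fake_report, [17, 15, 21, 19])
    pool.close()
    pool.join()
    print('elapsed: %.1fs, results: %s' % (time.time() - start, results))
```

If this finishes in roughly 2 seconds rather than 8, the pool itself is parallel and the bottleneck is on the server side.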

The task execution time is almost entirely dictated by the response time of your local endpoint at http://localhost/fee_calculate/generate-single. How are you running this server? If you observe the execution times for each of the reports you will notice that they are increasing in steps of ~10 seconds, which is your artificially imposed processing delay on the server side (sleep(10)).

I suspect that your local server is only single-threaded, and so can only handle one request at a time. This means that each request must be completed before the next one is processed, so when you make multiple concurrent requests like this you don't actually get any decrease in processing time.

Upvotes: 1
