Sanjay Chintha

Reputation: 326

Python multiprocessing on a for loop

I have a function with two parameters:

reqs =[1223,1456,1243,20455]
url = "pass a url"
def crawl(i,url):
   print("%s is %s" % (i, url))

I want to run the above function for each item in reqs using multiprocessing.

from multiprocessing import Pool

if __name__ == '__main__':
    p = Pool(5)   
    print(p.map([crawl(i,url) for i in reqs]))

The above code is not working for me. Can anyone please help me with this?

----- ADDING NEW CODE ---------

from multiprocessing import Pool

reqs = [1223,1456,1243,20455]
url = "pass a url"

def crawl(combined_args):
   print("%s is %s" % (combined_args[0], combined_args[1]))

def main():
    p = Pool(5)   
    print(p.map(crawl, [(i,url) for i in reqs]))

if __name__ == '__main__':
    main()

When I try to execute the above code, I get an error (it was posted as a screenshot, which is not reproduced here).

Upvotes: 1

Views: 288

Answers (2)

Sanjay Chintha

Reputation: 326

Issue resolved. The crawl function should be in a separate module, like below:

crawler.py

def crawl(combined_args):
   print("%s is %s" % (combined_args[0], combined_args[1]))

run.py

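from multiprocessing import Pool
import crawler

# Same inputs as in the question; run.py needs its own definitions
reqs = [1223,1456,1243,20455]
url = "pass a url"

def main():
    p = Pool(5)
    # Pass the function object and an iterable of (i, url) tuples
    print(p.map(crawler.crawl, [(i,url) for i in reqs]))

if __name__ == '__main__':
    main()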

The output will then look like this:

1223 is pass a url
1456 is pass a url
1243 is pass a url
20455 is pass a url
[None, None, None, None]  # This is the output of the map function

Upvotes: 1

Amiram

Reputation: 1305

According to the documentation for multiprocessing.Pool.map, this is the function signature:

map(func, iterable[, chunksize])

You are passing map a single list (the results of calls to crawl that have already run) instead of the two arguments it expects, (func, iterable).
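To make that concrete, here is a minimal sketch that uses the reqs, url, and crawl from the question and no pool at all. The names already_run, func, and inputs are only illustrative. It shows what the list comprehension actually evaluates to, next to the two pieces that map is asking for.

```python
reqs = [1223, 1456, 1243, 20455]
url = "pass a url"

def crawl(i, url):
    print("%s is %s" % (i, url))

# The comprehension calls crawl immediately, one item at a time, in the
# current process. What is left over is a list of return values, not a function.
already_run = [crawl(i, url) for i in reqs]
print(already_run)

# What Pool.map expects instead: the function object itself (no parentheses)
# and an iterable holding one input per call.
func = crawl
inputs = [(i, url) for i in reqs]
print(callable(func), inputs[0])
```

Because crawl only prints and returns nothing, already_run holds one None per request, and handing that single list to p.map leaves the required iterable argument missing.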

Please refer to the following example of multiprocessing.pool (source):

import time
from multiprocessing import Pool

work = (["A", 5], ["B", 2], ["C", 1], ["D", 3])

def work_log(work_data):
    print(" Process %s waiting %s seconds" % (work_data[0], work_data[1]))
    time.sleep(int(work_data[1]))
    print(" Process %s Finished." % work_data[0])

def pool_handler():
    p = Pool(2)
    p.map(work_log, work)


if __name__ == '__main__':
    pool_handler()

Please note that the example passes a single argument to the work_log function, and inside the function it uses indexes to get to the relevant fields.


Referring to your example:

from multiprocessing import Pool

reqs = [1223,1456,1243,20455]
url = "pass a url"

def crawl(combined_args):
   print("%s is %s" % (combined_args[0], combined_args[1]))

def main():
    p = Pool(5)   
    print(p.map(crawl, [(i,url) for i in reqs]))

if __name__ == '__main__':
    main()

This results in:

1223 is pass a url
1456 is pass a url
1243 is pass a url
20455 is pass a url
[None, None, None, None]  # This is the output of the map function
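If you would rather keep the original two-parameter crawl(i, url) instead of packing the arguments into one tuple, two standard-library options may help: Pool.starmap (Python 3.3+), which unpacks each tuple into positional arguments, and functools.partial, which fixes url up front so that map only has to supply i. The following is a sketch, not the only way to do it. It assumes crawl returns the string instead of printing it so that the results are visible in the returned list, and the names via_starmap and via_partial are only illustrative.

```python
from functools import partial
from multiprocessing import Pool

reqs = [1223, 1456, 1243, 20455]
url = "pass a url"

def crawl(i, url):
    # Returning the text (an assumption for this sketch) makes map's result useful
    return "%s is %s" % (i, url)

def main():
    # The with-block cleans up the worker processes when the work is done
    with Pool(2) as p:
        # starmap unpacks each (i, url) tuple into crawl(i, url)
        via_starmap = p.starmap(crawl, [(i, url) for i in reqs])
        # partial pins url, so plain map only needs to pass each i
        via_partial = p.map(partial(crawl, url=url), reqs)
    return via_starmap, via_partial

if __name__ == '__main__':
    via_starmap, via_partial = main()
    print(via_starmap)
    print(via_partial)
```

Both calls return the results in the same order as reqs, so the two printed lists should be identical.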

Upvotes: 2
