kich

Reputation: 764

how to use multiprocessing/multithreading on my code?

I am using the code below on a dictionary of about 100,000 keys and values. I wanted to make it faster with multiprocessing or multithreading, since each loop iteration is independent of the others. Can anyone tell me how to apply this, and which one (multiprocessing or multithreading) is more apt for this kind of task?

from urlparse import urlparse

def ProcessAllURLs(URLs):
    for eachurl in URLs:
        x = urlparse(eachurl)
        print x.netloc

ProcessAllURLs(URLs)

Thanks

Upvotes: 1

Views: 175

Answers (2)

Ken

Reputation: 1858

The multiprocessing library is probably best for your example. It looks like your code could be rewritten as:

from urlparse import urlparse
from multiprocessing import Pool

nprocs = 2  # nprocs is the number of worker processes to run
ParsePool = Pool(nprocs)
ParsedURLs = ParsePool.map(urlparse, URLs)

Pool.map works like the built-in map function, but distributes the function calls across the pool's worker processes.

See http://docs.python.org/library/multiprocessing.html for more on multiprocessing.

Upvotes: 1

Multimedia Mike

Reputation: 13216

I would recommend Python's multiprocessing library. In particular, study the section labeled "Using a pool of workers". It should be pretty quick to rework the above code so that it uses all available cores of your system.

One tip, though: don't print URLs from the pool workers. It is better to pass the answers back to the main process and aggregate them there for printing. Printing from different processes will result in a lot of jumbled, uncoordinated console output.
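That tip can be sketched like this (a minimal example, written for Python 3, where the old urlparse module lives in urllib.parse; the URL list is purely illustrative):

```python
from multiprocessing import Pool
from urllib.parse import urlparse

def get_netloc(url):
    # Runs in a worker process: parse the URL and return only the host
    # part, instead of printing from inside the worker.
    return urlparse(url).netloc

if __name__ == "__main__":
    URLs = ["http://example.com/a", "https://docs.python.org/3/"]
    with Pool(2) as pool:
        # map blocks until all workers finish and returns the results
        # in the same order as the input list.
        netlocs = pool.map(get_netloc, URLs)
    # All printing happens here, in the main process, so the output
    # cannot interleave.
    for netloc in netlocs:
        print(netloc)
```

The `if __name__ == "__main__":` guard matters: on platforms that spawn rather than fork, each worker re-imports the script, and without the guard it would try to create pools of its own.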

Upvotes: 1
