Reputation: 3846
I am trying to utilize Python's multiprocessing library to quickly run a function using the 8 processing cores I have on a Linux VM I created. As a test, I am getting the time in seconds it takes for a worker pool with 4 processes to run a function, and the time it takes running the same function without using a worker pool. The time in seconds is coming out as about the same, in some case it is taking the worker pool much longer to process than without.
Script
import requests
import datetime
import multiprocessing as mp
shared_results = []
def stress_test_url(url):
print('Starting Stress Test')
count = 0
while count <= 200:
response = requests.get(url)
shared_results.append(response.status_code)
count += 1
pool = mp.Pool(processes=4)
now = datetime.datetime.now()
results = pool.apply(stress_test_url, args=(url,))
diff = (datetime.datetime.now() - now).total_seconds()
now = datetime.datetime.now()
results = stress_test_url(url)
diff2 = (datetime.datetime.now() - now).total_seconds()
print(diff)
print(diff2)
Terminal Output
Starting Stress Test
Starting Stress Test
44.316212
41.874116
Upvotes: 1
Views: 120
Reputation: 15040
The apply
function of multiprocessing.Pool
simply runs a function in a separate process and waits for its results. It takes a little bit more than running sequentially as it needs to pack the job to be processed and ship it to the child process via a pipe
.
multiprocessing
doesn't make sequential operations faster, it simply allows them to be run in parallel if you hardware has more than one core.
Just try this:
urls = ["http://google.com",
"http://example.com",
"http://stackoverflow.com",
"http://python.org"]
results = pool.map(stress_test_url, urls)
You will see that the 4 URLs get visited seemingly at the same time. This means your logic reduces the amount of time necessary to visit N websites to N / processes
.
Lastly, benchmarking a function which performs an HTTP request is a very poor way to measure performance as networks are unreliable. You will hardly get two executions which take the same amount of time no matter whether you use multiprocessing
or not.
Upvotes: 2