Reputation: 119
I am running the below code in Python. The code shows the example of how to use multiprocessing in Python. But it is not printing the result (line 16), Only the dataset (line 9) is getting printed and the kernel keeps on running. I am of the opinion that the results should be produced instantly as the code is using various cores of the cpu through multiprocessing so fast execution of the code should be there. Can someone tell what is the issue??
from multiprocessing import Pool
def square(x):
# calculate the square of the value of x
return x*x
if __name__ == '__main__':
# Define the dataset
dataset = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
# Output the dataset
print ('Dataset: ' + str(dataset)) # Line 9
# Run this with a pool of 5 agents having a chunksize of 3 until finished
agents = 5
chunksize = 3
with Pool(processes=agents) as pool:
result = pool.map(square, dataset, chunksize)
# Output the result
print ('Result: ' + str(result)) # Line 16
Upvotes: 1
Views: 93
Reputation: 44128
You are running under Windows. Look at the console output where you started up Jupyter. You will see 5 instances (you have 5 processes in your pool) of:
AttributeError: Can't get attribute 'square' on <module 'main' (built-in)>
You need to put your worker function in a file, for instance, workers.py
, in the same directory as your Jupyter code:
workers.py
def square(x):
# calculate the square of the value of x
return x*xdef square(x):
Then remove the above function from your cell and instead add:
import workers
and then:
with Pool(processes=agents) as pool:
result = pool.map(workers.square, dataset, chunksize)
Note
See my comment(s) to your post concerning the chunksize
argument to the Pool
constructor. The default chunksize
for the map
method when None
is calculated more or less as follows based on the iterable
argument:
if not hasattr(iterable, '__len__'):
iterable = list(iterable)
if chunksize is None:
chunksize, extra = divmod(len(iterable), len(self._pool) * 4)
if extra:
chunksize += 1
So I mispoke in one deatil: Regardless of whether you specify a chunksize
or not, map
will convert your iterable to a list
if it needs to in order to get its length. But essentially the default chunksize
used is the ceiling of the number of jobs being submitted divided by 4 times the pool size.
Upvotes: 1
Reputation: 329
There is a known bug in older versions of python and Windows10 https://bugs.python.org/issue35797
It occurs when using multiprocessing through venv. Bugfix is released in Python 3.7.3.
Upvotes: 0