Reputation: 3562
Here's a simple script to do a job in parallel:
import multiprocessing as mp

def f(x):
    return x + 1

pool = mp.Pool(2)
res = pool.map(f, range(10))
pool.close()
print(res)
It used to just work. Lately, it doesn't. I don't know what changed, perhaps a python update?
EDIT: it works fine in python 3.7.4, but not in 3.8.3
When I run it from IPython (using Spyder, specifically), I get the following, repeated ad infinitum:
Process SpawnPoolWorker-1:
Traceback (most recent call last):
File "/opt/anaconda3/envs/two_step_line/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/opt/anaconda3/envs/two_step_line/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/opt/anaconda3/envs/two_step_line/lib/python3.8/multiprocessing/pool.py", line 114, in worker
task = get()
File "/opt/anaconda3/envs/two_step_line/lib/python3.8/multiprocessing/queues.py", line 358, in get
return _ForkingPickler.loads(res)
AttributeError: Can't get attribute 'f' on <module '__main__' (built-in)>
Process SpawnPoolWorker-2:
Traceback (most recent call last):
File "/opt/anaconda3/envs/two_step_line/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/opt/anaconda3/envs/two_step_line/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/opt/anaconda3/envs/two_step_line/lib/python3.8/multiprocessing/pool.py", line 114, in worker
task = get()
File "/opt/anaconda3/envs/two_step_line/lib/python3.8/multiprocessing/queues.py", line 358, in get
return _ForkingPickler.loads(res)
AttributeError: Can't get attribute 'f' on <module '__main__' (built-in)>
Process SpawnPoolWorker-3:
This used to just work. I'm on python 3.8.3 on a mac, and the script works on python 3.7.4.
More importantly, how can I fix this?
EDIT2: I figured out that I can wrap it in
if __name__ == "__main__":
    pool...
And it will work fine from the command line if I save my script to a .py file. BUT IT DOES NOT WORK INTERACTIVELY. I usually do my development interactively, and this change is annoying. Does anyone know how to run simple mp loops interactively in Python 3.8.3?
Edit3: Apparently the problem stems from the fact that multiprocessing in 3.8 now uses spawn rather than fork by default on Mac, per this. Forking is considered "unsafe" for some reason. I don't follow the discussion, but a simple, potentially "unsafe" workaround is to call this before creating the pool:
mp.set_start_method('fork')
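For anyone who wants to try the fork workaround without changing the global start method, here is a minimal sketch. It assumes a Unix-like system (fork is not available on Windows), and it uses mp.get_context('fork') instead of mp.set_start_method('fork'). That swap is my choice, not something from the linked discussion: set_start_method raises a RuntimeError if the start method was already set, which is easy to hit when re-running a cell. The function name run_with_fork is made up for illustration.

```python
import multiprocessing as mp


def f(x):
    return x + 1


def run_with_fork(n=10):
    # Hypothetical helper: build a pool from a fork-based context so the
    # process-wide default start method is left untouched.
    ctx = mp.get_context("fork")  # raises ValueError where fork is unavailable
    with ctx.Pool(2) as pool:
        return pool.map(f, range(n))


if __name__ == "__main__":
    print(run_with_fork())
```

With fork, the workers inherit the parent's memory, so they can find f even when it was defined interactively. The same "unsafe" caveat from the question still applies.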
Upvotes: 1
Views: 644
Reputation: 161
After a lot of digging and several dead ends I put together something that works! With only a minor restructuring of our code we can get interactive multiprocessing in Spyder4.
First, we need to ensure we are using the IPython console. We can confirm this by restarting the console in Spyder and verifying "IPython 7.xx.x" is printed.
Second, we need to move the multiprocessing code we want to be interactive into another file and wrap it in main(). Here is an example of what that may look like (ignore the warning about varFromInteractive for now; that variable will come from the interactive code cells):
from multiprocessing import Pool

def f(a):
    return a**2

def multiprocessTest():
    with Pool(5) as p:
        out = p.map(f, range(varFromInteractive))
    return out

def main():
    print('executing test.py')
    testing = multiprocessTest()
    print('multiprocessing results: ', testing)

# Module-level, so the interactive namespace can read it after the run
varFromTest = 'it worked!'

if __name__ == "__main__":
    main()
Let us say that we saved the above multiprocessing code in the file test.py.
Next, in our interactive file where we have individual cells we wish to run (cells are created by typing # %%), we can put:
# %%
from IPython import get_ipython
# %%
varFromInteractive = 10
get_ipython().run_line_magic('run', '-i test.py')
# %%
print(varFromTest)
When running these three cells I received the output:
runcell(19, 'C:/Users/redacted/exploration.py')
runcell(27, 'C:/Users/redacted/exploration.py')
executing test.py
multiprocessing results: [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
runcell(28, 'C:/Users/redacted/exploration.py')
it worked!
Which is awesome, because not only is the multiprocessed code executing, but the multiprocessed code in test.py has access to the variable varFromInteractive and the interactive code in a separate file has access to the variable varFromTest!
So what is the magic making this work? Well, IPython magic, incidentally.
The line get_ipython().run_line_magic('run', '-i test.py') tells the IPython shell to execute the code contained in the file test.py, with the -i argument meaning "run the file in IPython’s namespace instead of an empty one" [source]. Further, "The file is executed in a namespace initially consisting only of __name__=='__main__' and sys.argv constructed as indicated" [same source].
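To make the namespace behaviour a bit more concrete, here is a rough standard-library analogy using runpy.run_path. This is not what IPython does internally, and it leaves out the multiprocessing part entirely; it only illustrates the two ideas quoted above: the file runs with __name__ set to "__main__", and it can see variables supplied by the caller. The helper name run_like_percent_run_i and the results variable are made up for this sketch. Unlike %run -i, runpy seeds a fresh namespace and returns the resulting globals instead of writing into the caller's namespace.

```python
import runpy
import tempfile
import textwrap
from pathlib import Path

# Stand-in for test.py, with the Pool replaced by a plain list comprehension.
SCRIPT = textwrap.dedent('''
    def main():
        # varFromInteractive is not defined in this file; the caller supplies it
        return [a ** 2 for a in range(varFromInteractive)]

    varFromTest = "it worked!"

    if __name__ == "__main__":
        results = main()
''')


def run_like_percent_run_i(namespace):
    # Hypothetical helper: write the script to a temporary file and run it
    # as "__main__", with the caller's variables pre-loaded into its globals.
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "test.py"
        path.write_text(SCRIPT)
        return runpy.run_path(str(path), init_globals=namespace, run_name="__main__")


if __name__ == "__main__":
    ns = run_like_percent_run_i({"varFromInteractive": 10})
    print(ns["results"])
    print(ns["varFromTest"])
```

The returned dictionary plays the role of the interactive namespace after the run: it holds both the value computed from varFromInteractive and the new variable varFromTest.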
From what I understand, Spyder used to support IPython magic natively in code cells, so one could write %run -i test.py. This is no longer the case, as described in this github issue:
IPython magics are not valid Python code, so we decided to not support them anymore in Python files. That will avoid the common problem of files that work in Spyder but not outside of it.
The poster of that information, ccordoba12, also gave the example of how to use IPython magic in Spyder4 as:
from IPython import get_ipython
get_ipython().magic('who print')
That information was combined with the IPython docs linked above and these IPython docs to produce a solution.
The poster of the question uses a Mac, and I use a Windows PC, but I hope the solution translates.
Upvotes: 0
Reputation: 161
I have been working on multiprocessing using Pool in Spyder the past few days and I ran into an issue similar to yours. I was not able to run any multiprocessed code interactively (as a code block), but I was able to get multiprocessed code running with:
from multiprocessing import Pool

def main():
    with Pool(5) as p:
        ...  # do multiprocessing here

if __name__ == "__main__":
    main()
When I ran the file by pressing the green play button (or F5) the multiprocessed code properly executed.
Upvotes: 1