Reputation: 959
I'm seeing this error when I try to use multiprocessing in a Python script running as an Azure Function App. Someone else is facing the same issue here (ModuleNotFoundError: No module named '__app__') but hasn't answered the follow-up question about his picklable/non-picklable methods.
So here is the simplest possible example that gives the error:
__init__.py
from multiprocessing import Pool
import azure.functions as func

def f(x):
    return x*x

def main(mytimer: func.TimerRequest) -> None:
    with Pool(5) as p:
        print(p.map(f, [1, 2, 3]))
And here is the console output:
func run
[2022-02-19T21:26:37.672Z] Worker process started and initialized.
[2022-02-19T21:26:40.048Z] Executing 'Functions.TCDatesToCT' (Reason='Timer fired at 2022-02-19T16:26:40.0177058-05:00', Id=f632a25f-5621-4e6f-93f0-476ce60a5221)
[2022-02-19T21:26:40.284Z] Process SpawnPoolWorker-1:
[2022-02-19T21:26:40.296Z] Process SpawnPoolWorker-3:
[2022-02-19T21:26:40.298Z] Process SpawnPoolWorker-2:
[2022-02-19T21:26:40.319Z] Traceback (most recent call last):
[2022-02-19T21:26:40.322Z] Traceback (most recent call last):
[2022-02-19T21:26:40.324Z] Traceback (most recent call last):
[2022-02-19T21:26:40.326Z] File "C:\Users\bwarrick\AppData\Local\Programs\Python\Python39\lib\multiprocessing\process.py", line 315, in _bootstrap
[2022-02-19T21:26:40.329Z] self.run()
[2022-02-19T21:26:40.330Z] File "C:\Users\bwarrick\AppData\Local\Programs\Python\Python39\lib\multiprocessing\process.py", line 315, in _bootstrap
[2022-02-19T21:26:40.336Z] self.run()
[2022-02-19T21:26:40.338Z] File "C:\Users\bwarrick\AppData\Local\Programs\Python\Python39\lib\multiprocessing\process.py", line 108, in run
[2022-02-19T21:26:40.344Z] self._target(*self._args, **self._kwargs)
[2022-02-19T21:26:40.346Z] File "C:\Users\bwarrick\AppData\Local\Programs\Python\Python39\lib\multiprocessing\process.py", line 108, in run
[2022-02-19T21:26:40.357Z] self._target(*self._args, **self._kwargs)
[2022-02-19T21:26:40.360Z] File "C:\Users\bwarrick\AppData\Local\Programs\Python\Python39\lib\multiprocessing\process.py", line 315, in _bootstrap
[2022-02-19T21:26:40.365Z] self.run()
[2022-02-19T21:26:40.367Z] File "C:\Users\bwarrick\AppData\Local\Programs\Python\Python39\lib\multiprocessing\pool.py", line 114, in worker
[2022-02-19T21:26:40.373Z] task = get()
[2022-02-19T21:26:40.375Z] File "C:\Users\bwarrick\AppData\Local\Programs\Python\Python39\lib\multiprocessing\process.py", line 108, in run
[2022-02-19T21:26:40.377Z] self._target(*self._args, **self._kwargs)
[2022-02-19T21:26:40.383Z] File "C:\Users\bwarrick\AppData\Local\Programs\Python\Python39\lib\multiprocessing\pool.py", line 114, in worker
[2022-02-19T21:26:40.391Z] task = get()
[2022-02-19T21:26:40.395Z] File "C:\Users\bwarrick\AppData\Local\Programs\Python\Python39\lib\multiprocessing\queues.py", line 368, in get
[2022-02-19T21:26:40.401Z] return _ForkingPickler.loads(res)
[2022-02-19T21:26:40.403Z] File "C:\Users\bwarrick\AppData\Local\Programs\Python\Python39\lib\multiprocessing\pool.py", line 114, in worker
[2022-02-19T21:26:40.405Z] task = get()
[2022-02-19T21:26:40.407Z] File "C:\Users\bwarrick\AppData\Local\Programs\Python\Python39\lib\multiprocessing\queues.py", line 368, in get
[2022-02-19T21:26:40.416Z] return _ForkingPickler.loads(res)
[2022-02-19T21:26:40.419Z] ModuleNotFoundError: No module named '__app__'
[2022-02-19T21:26:40.422Z] File "C:\Users\bwarrick\AppData\Local\Programs\Python\Python39\lib\multiprocessing\queues.py", line 368, in get
[2022-02-19T21:26:40.431Z] return _ForkingPickler.loads(res)
[2022-02-19T21:26:40.432Z] ModuleNotFoundError: No module named '__app__'
[2022-02-19T21:26:40.434Z] ModuleNotFoundError: No module named '__app__'
Usually I use multiprocessing after an "if __name__ == '__main__':" guard, so I think the issue is in how the Function App calls the main() function inside __init__.py. I've been trying to figure this out for a couple of weeks now. Any ideas are appreciated. Thanks.
Upvotes: 4
Views: 880
Reputation: 31
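This happens on Windows machines when the pool tries to create child processes. Windows, unlike Linux, has no fork() system call, so multiprocessing uses the "spawn" start method: each child starts a fresh Python interpreter, and the task (including the function to run) has to be pickled and sent to it. Functions are pickled by reference, as a module name plus a function name, so they are only picklable if they are defined at the top level of a module, and the child must be able to import that module under the same name. The traceback shows the child failing at exactly that step: it cannot import '__app__' while unpickling the task. To avoid this error, use Linux Azure Function Apps, or declare the functions you want to parallelise at the top level of a module that the child process can import.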
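To make the "pickled by reference" point concrete, here is a minimal sketch using only the standard library. It does not involve Azure at all. The module name "__app__.MyFunction" in the hand-written pickle is a hypothetical stand-in for whatever name the Functions host registers, and is chosen only so that the import fails the same way as in the traceback above.

```python
import math
import multiprocessing
import pickle

# Start methods available on this platform. Windows offers only "spawn";
# Linux also offers "fork", which copies the parent instead of re-importing.
available_methods = multiprocessing.get_all_start_methods()
print("start methods:", available_methods)

# A function is pickled as a reference (module name + function name),
# not as its code. The payload therefore contains both names.
payload = pickle.dumps(math.sqrt, protocol=0)
print("pickled math.sqrt:", payload)

# Unpickling re-imports the module and looks the function up by name.
restored = pickle.loads(payload)

# Hand-written protocol-0 pickle that refers to attribute "f" of a module
# named "__app__.MyFunction". That name is hypothetical (an assumption for
# illustration) and is not importable in a plain interpreter.
broken_payload = b"c__app__.MyFunction\nf\n."

error_message = None
try:
    pickle.loads(broken_payload)
except ModuleNotFoundError as exc:
    # Same class of failure a spawned worker hits when it cannot import
    # the module that the pickled function reference points to.
    error_message = str(exc)

print("unpickling failed with:", error_message)
```

The last step fails because the receiving interpreter has to import the named module before it can find the function, which is what each spawned pool worker does when it reads a task from the queue.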
Upvotes: 3