concurrent.futures fails on Windows

I recently moved from my Mac to a Windows machine, and I'm trying to port all the scripts I created while working on the Mac. In particular, there is one that's not working that I really need to fix.

The problem with this code is that it uses concurrent.futures to spawn many child processes and, while on the Mac it works fine (I just tested it on the other computer), the same code on Windows gives me the following error:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\stram\Anaconda3\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\Users\stram\Anaconda3\lib\multiprocessing\spawn.py", line 114, in _main
    prepare(preparation_data)
  File "C:\Users\stram\Anaconda3\lib\multiprocessing\spawn.py", line 225, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\Users\stram\Anaconda3\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
    run_name="__mp_main__")
  File "C:\Users\stram\Anaconda3\lib\runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "C:\Users\stram\Anaconda3\lib\runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "C:\Users\stram\Anaconda3\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "D:\Lavoro\Dottorato\3_klip_tiles.py", line 188, in <module>
    mk_postagestamps(MainID_list,len(MainID_list),counts_df,header_df,path2dir,list_of_filters,fk_pos,workers,KLIPmodes=KLIPmodes) #build the cube dataframe form the selected UniqueID stars
  File "D:\Lavoro\Dottorato\3_klip_tiles.py", line 45, in mk_postagestamps
    for variable in executor.map(task,MainID_list,repeat(counts_df),repeat(header_df),repeat(KLIPmodes),repeat(fk_pos),repeat(filters),repeat(path2dir),chunksize=chunksize): #the task routine is where everything is done!
  File "C:\Users\stram\Anaconda3\lib\concurrent\futures\process.py", line 645, in map
    timeout=timeout)
  File "C:\Users\stram\Anaconda3\lib\concurrent\futures\_base.py", line 575, in map
    fs = [self.submit(fn, *args) for args in zip(*iterables)]
  File "C:\Users\stram\Anaconda3\lib\concurrent\futures\_base.py", line 575, in <listcomp>
    fs = [self.submit(fn, *args) for args in zip(*iterables)]
  File "C:\Users\stram\Anaconda3\lib\concurrent\futures\process.py", line 615, in submit
    self._start_queue_management_thread()
  File "C:\Users\stram\Anaconda3\lib\concurrent\futures\process.py", line 569, in _start_queue_management_thread
    self._adjust_process_count()
  File "C:\Users\stram\Anaconda3\lib\concurrent\futures\process.py", line 593, in _adjust_process_count
    p.start()
  File "C:\Users\stram\Anaconda3\lib\multiprocessing\process.py", line 112, in start
    self._popen = self._Popen(self)
  File "C:\Users\stram\Anaconda3\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\Users\stram\Anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 46, in __init__
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "C:\Users\stram\Anaconda3\lib\multiprocessing\spawn.py", line 143, in get_preparation_data
    _check_not_importing_main()
  File "C:\Users\stram\Anaconda3\lib\multiprocessing\spawn.py", line 136, in _check_not_importing_main
    is not going to be frozen to produce an executable.''')
RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.
[The same traceback is printed several more times, partly interleaved, once for each worker process that is spawned.]

Edit:

I applied the suggestion from the error message, adding:

if __name__ == '__main__':
    freeze_support()

at the beginning of the code, but now I get a new error. What is this __spec__ attribute that it says is missing? I am running the code from the Anaconda PowerShell prompt.

AttributeError                            Traceback (most recent call last)
D:\Lavoro\Dottorato\3_klip_tiles.py in <module>
    199 if __name__ == '__main__':
    200     freeze_support()
--> 201     main()

D:\Lavoro\Dottorato\3_klip_tiles.py in main()
    188     if MainID_sel!=None:MainID_list=[int(i) for i in MainID_sel.split(',')]
    189
--> 190     mk_postagestamps(MainID_list,len(MainID_list),counts_df,header_df,path2dir,list_of_filters,fk_pos,workers,KLIPmodes=KLIPmodes,showplot=showplot) #build the cube dataframe form the selected UniqueID stars
    191
    192     header_df.to_hdf(file_path,'header',mode='w',table=True)

D:\Lavoro\Dottorato\3_klip_tiles.py in mk_postagestamps(MainID_list, ntarget, counts_df, header_df, path2dir, filters, fk_pos, workers, showplot, use_std, KLIPmodes)
     44         chunksize = ntarget // num_of_chunks
     45         if chunksize <=0: chunksize=1
---> 46         for variable in executor.map(task,MainID_list,repeat(counts_df),repeat(header_df),repeat(KLIPmodes),repeat(fk_pos),repeat(filters),repeat(path2dir),repeat(showplot),chunksize=chunksize): #the task routine is where everything is done!
     47             continue
     48     ##########################################################

~\Anaconda3\lib\concurrent\futures\process.py in map(self, fn, timeout, chunksize, *iterables)
    643         results = super().map(partial(_process_chunk, fn),
    644                               _get_chunks(*iterables, chunksize=chunksize),
--> 645                               timeout=timeout)
    646         return _chain_from_iterable_of_lists(results)
    647

~\Anaconda3\lib\concurrent\futures\_base.py in map(self, fn, timeout, chunksize, *iterables)
    573             end_time = timeout + time.monotonic()
    574
--> 575         fs = [self.submit(fn, *args) for args in zip(*iterables)]
    576
    577         # Yield must be hidden in closure so that the futures are submitted

~\Anaconda3\lib\concurrent\futures\_base.py in <listcomp>(.0)
    573             end_time = timeout + time.monotonic()
    574
--> 575         fs = [self.submit(fn, *args) for args in zip(*iterables)]
    576
    577         # Yield must be hidden in closure so that the futures are submitted

~\Anaconda3\lib\concurrent\futures\process.py in submit(self, fn, *args, **kwargs)
    613             self._queue_management_thread_wakeup.wakeup()
    614
--> 615             self._start_queue_management_thread()
    616             return f
    617     submit.__doc__ = _base.Executor.submit.__doc__

~\Anaconda3\lib\concurrent\futures\process.py in _start_queue_management_thread(self)
    567                 thread_wakeup.wakeup()
    568             # Start the processes so that their sentinels are known.
--> 569             self._adjust_process_count()
    570             self._queue_management_thread = threading.Thread(
    571                 target=_queue_management_worker,

~\Anaconda3\lib\concurrent\futures\process.py in _adjust_process_count(self)
    591                       self._initializer,
    592                       self._initargs))
--> 593             p.start()
    594             self._processes[p.pid] = p
    595

~\Anaconda3\lib\multiprocessing\process.py in start(self)
    110                'daemonic processes are not allowed to have children'
    111         _cleanup()
--> 112         self._popen = self._Popen(self)
    113         self._sentinel = self._popen.sentinel
    114         # Avoid a refcycle if the target function holds an indirect

~\Anaconda3\lib\multiprocessing\context.py in _Popen(process_obj)
    320         def _Popen(process_obj):
    321             from .popen_spawn_win32 import Popen
--> 322             return Popen(process_obj)
    323
    324     class SpawnContext(BaseContext):

~\Anaconda3\lib\multiprocessing\popen_spawn_win32.py in __init__(self, process_obj)
     44
     45     def __init__(self, process_obj):
---> 46         prep_data = spawn.get_preparation_data(process_obj._name)
     47
     48         # read end of pipe will be "stolen" by the child process

~\Anaconda3\lib\multiprocessing\spawn.py in get_preparation_data(name)
    170     # or through direct execution (or to leave it alone entirely)
    171     main_module = sys.modules['__main__']
--> 172     main_mod_name = getattr(main_module.__spec__, "name", None)
    173     if main_mod_name is not None:
    174         d['init_main_from_name'] = main_mod_name

AttributeError: module '__main__' has no attribute '__spec__'

Edit 2:

So in the end I think I solved it. I found this: Python Multiprocessing error: AttributeError: module '__main__' has no attribute '__spec__'

and, following the suggestion there, I modified the routine as follows:

if __name__ == '__main__':
    __spec__ = None
    freeze_support()
    main()

This works now (I still don't know why it was failing before, though, so if anyone wants to explain, I'd really appreciate it).

Upvotes: 0

Views: 3036

Answers (1)

Roland Smith

Reputation: 43505

Multiprocessing works differently on ms-windows because that OS lacks the fork system call used on UNIX and macOS.

fork creates the child process as a perfect copy of the parent process. All the code and data in both processes are the same; the only difference is the return value of the fork call, which is how the new process knows it is the copy. So the child process has access to (a copy of) all the data in the parent process.
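A tiny sketch of that behaviour (POSIX-only, so it runs on UNIX/macOS but not on ms-windows, which is exactly the point):

```python
# POSIX-only: fork() duplicates the whole process. Parent and child run the
# same code and differ only in fork()'s return value.
import os

pid = os.fork()
if pid == 0:
    # In the child, fork() returned 0: this process is the copy.
    print("child: running as a copy of the parent")
    os._exit(0)
else:
    # In the parent, fork() returned the child's pid.
    os.waitpid(pid, 0)
    print(f"parent: fork() returned child pid {pid}")
```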

On ms-windows, multiprocessing tries to "fake" fork by launching a new Python interpreter and having it import your main module. This means (among other things) that your main module has to be importable without side effects such as starting another process; hence the need for the if __name__ == '__main__' guard. It also means that your worker processes may or may not have access to data created in the parent process, depending on where it is created: they will have access to anything created at module level (outside the guard), but not to anything created inside the main block.
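The spawn-safe layout the error message asks for can be sketched like this (the names task and main are simplified stand-ins, not the question's original code):

```python
# Minimal spawn-safe layout for a ProcessPoolExecutor script.
# `task` is a simplified stand-in for the question's worker function.
from concurrent.futures import ProcessPoolExecutor
from itertools import repeat

def task(main_id, scale):
    # Must live at module level: spawned children re-import this module
    # and look the function up by name.
    return main_id * scale

def main():
    ids = [1, 2, 3, 4]
    with ProcessPoolExecutor(max_workers=2) as executor:
        # repeat() broadcasts a constant argument to every call, as in
        # the question's executor.map(...) line.
        results = list(executor.map(task, ids, repeat(10), chunksize=2))
    print(results)  # [10, 20, 30, 40]

if __name__ == '__main__':
    # Everything with side effects stays behind this guard, so the
    # child's re-import of the module does not start more processes.
    main()
```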

Upvotes: 2
