Peque

Reputation: 14851

Weird multiprocessing block importing Numba function

Environment

Initial setup (works fine)

Two files main.py and numbamodule.py:

main.py

This spawns two processes that run the execute_numba function.

import time
from importlib import import_module
from multiprocessing import Process


def execute_numba(name):
    # Import the function
    importfunction = 'numbamodule.numba_function'
    module = import_module(importfunction.split('.')[0])
    function = getattr(module, importfunction.split('.')[-1])
    while True:
        print(str(name) + ' - executing Numba function...')
        # Execute the function
        function(10)
        time.sleep(0.1)


if __name__ == '__main__':
    processes = [Process(target=execute_numba, args=(i,)) for i in range(2)]
    for p in processes:
        p.start()
    time.sleep(1)
    for p in processes:
        p.terminate()

numbamodule.py

This defines a simple function, numba_function:

import numba


@numba.jit()
def numba_function(x):
    total = 0
    for i in range(x):
        total += i
    return total

I can run the main.py script and see both processes printing:

$ python main.py
0 - executing Numba function...
1 - executing Numba function...
0 - executing Numba function...
1 - executing Numba function...
0 - executing Numba function...
1 - executing Numba function...
[...]

Breaking it

The way I break it is a bit weird, but this is what I stumbled upon when trying to minimize a reproducible test case. Please tell me if you can reproduce the same behavior too.

In main.py I just add one of the imports proposed below after the last Process import (that is, uncomment one line and try):

import time
from importlib import import_module
from multiprocessing import Process

#
# Adding one of the import lines below results in a block...
# (you may need to install the packages first in the virtual environment)
#
#import matplotlib
#import Pyro4
#import scipy
#import dill


def execute_numba(name):
# [...]

Then one process may block in the execute_numba function (in particular, at the import_module() call):

$ python main.py 
1 - executing Numba function...
1 - executing Numba function...
1 - executing Numba function...
1 - executing Numba function...
1 - executing Numba function...
1 - executing Numba function...
[...]

For me, the matplotlib and Pyro4 imports "work" the best. Even so, I cannot get the block in 100% of the runs... :-/

Note that I am simply adding a single import line, not actually using the package. Some other external imports result in a block as well, but I have found that the ones proposed above "work" best (block the most).

What is happening?

First of all, can you reproduce the same behavior? (I am especially interested in non-virtualized GNU/Linux machines.)

I don't know how to debug this or why this could be happening. Any ideas?

The fact that adding one random import xxx triggers the block scares me and makes little sense to me. Could this depend on timing or delays, which would explain why some imports break it and others do not?

Notes

Updates

Upvotes: 17

Views: 1102

Answers (3)

Peque

Reputation: 14851

It seems it was a Numba bug, acknowledged in issue 2431.

It seems to be fixed now. If you bump into this, update your numba and llvmlite installations. If that does not fix the problem, you probably should add a comment in that issue to reopen it.

As @stuartarchibald commented:

[...] it looks like one processed is blocked is because it has in actual fact segfaulted [...]

[...] Segfaults appearing from this location are almost always due to threads performing concurrent operations inside LLVM, or some issue to do with installing functions during Numba's initialisation sequence. [...]

[...] cannot reproduce any more with llvmlite==0.22.0dev0 and numba==0.37.0.dev [...]
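If you want to check whether a "blocked" process has in fact crashed, one option is to look at its exit code after the fact. The following is only a minimal sketch, not something taken from the issue: describe_exitcode is a hypothetical helper name of mine, and it relies on the documented behavior that multiprocessing reports a child killed by signal N with an exitcode of -N.

```python
import signal


def describe_exitcode(exitcode):
    """Turn a multiprocessing.Process.exitcode value into a readable string.

    Hypothetical helper, for illustration only.
    """
    if exitcode is None:
        # The process has not terminated (yet)
        return 'still running'
    if exitcode < 0:
        # A negative value -N means the child was killed by signal N
        try:
            name = signal.Signals(-exitcode).name
        except ValueError:
            name = 'unknown signal'
        return 'killed by signal {} ({})'.format(-exitcode, name)
    return 'exited with status {}'.format(exitcode)
```

In main.py you could, for example, print describe_exitcode(p.exitcode) for each process right before the p.terminate() calls: a process that segfaulted would be reported as killed by SIGSEGV, while a live one shows as still running. Calling the standard library's faulthandler.enable() at the top of execute_numba should additionally dump a Python traceback when the segfault happens.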

Upvotes: 1

devnull

Reputation: 56

This only applies to debugging matplotlib and is really a guess, but it might help you narrow down the problem a bit.

You can start your program, when including matplotlib, with:

python main.py --verbose-helpful

which shows you debug output for the matplotlib initialization. Since it sounds like an issue that is only present on your particular system, there might be a configuration issue, such as a matplotlibrc set up so that matplotlib starts in interactive mode.

Here is an overview on the available debug modes: https://matplotlib.org/users/customizing.html
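To check the interactive-mode guess directly, you could also print the relevant settings from Python. This is just a small sketch: matplotlib_startup_info is a hypothetical helper name, while matplotlib_fname(), get_backend() and is_interactive() are existing matplotlib functions.

```python
def matplotlib_startup_info():
    """Return basic matplotlib settings, or None if it is not installed.

    Hypothetical helper, for illustration only.
    """
    try:
        import matplotlib
    except ImportError:
        return None
    return {
        'version': matplotlib.__version__,
        # Path of the matplotlibrc file that is actually being loaded
        'rc_file': matplotlib.matplotlib_fname(),
        'backend': matplotlib.get_backend(),
        # True if matplotlib is configured to run in interactive mode
        'interactive': matplotlib.is_interactive(),
    }


if __name__ == '__main__':
    print(matplotlib_startup_info())
```

If 'interactive' comes back as True, or the backend is a GUI one, the file listed under 'rc_file' would be the first place to look.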

Upvotes: 0

saaj

Reputation: 25273

Here's a reproduction attempt in the official Python Docker environment. The Dockerfile follows (put it alongside your .py files).

FROM python:3.5

RUN pip install numba matplotlib pyro4

ADD . /opt
WORKDIR /opt

CMD python main.py

Then:

docker build -t so-44764520 .
docker run --rm -it so-44764520

Both variants behave the same way: main.py runs without blocking whether or not the "working" imports (matplotlib and Pyro4) are present.

Upvotes: -1
