Reputation: 1

TypingError: Failed in nopython mode pipeline (step: nopython frontend) in numba

I am trying to write my first parallelized function with Numba.

This is my code:

possible_EVENTS = np.array([""])
@njit(parallel=True,nopython=True)
def add_History(events,EVENTS):
    index=events.where(events=="Repair")
    add=np.array("")
    pre_events=events[index:]
    if ("LOMT" in pre_events) & ("DIMT" in pre_events):
        for i in pre_events:
            if "FU" not in i:
                add.append(i)
            else:
                break
    if len(add)>=5:
        possible_EVENTS .append(add)

for i in tqdm(packages):
    events=np.array(df_["Event"].loc[df_["Packages"]==i].values)
    add_History(events,possible_EVENTS )

Running it raises this error:

TypingError                               Traceback (most recent call last)
C:\Users\local_PIETAPA\Temp\ipykernel_16492\3239905456.py in <module>
     16 for container in tqdm(container_search):
     17     events=np.array(df_["EVENT_CODE"].loc[df_["UNIT"]==container].values)
---> 18     add_EPOS_History(events,EPOS_EVENTS)

C:\ProgramData\Anaconda3\lib\site-packages\numba\core\dispatcher.py in _compile_for_args(self, *args, **kws)
    466                 e.patch_message(msg)
    467 
--> 468             error_rewrite(e, 'typing')
    469         except errors.UnsupportedError as e:
    470             # Something unsupported is present in the user code, add help info

C:\ProgramData\Anaconda3\lib\site-packages\numba\core\dispatcher.py in error_rewrite(e, issue_type)
    407                 raise e
    408             else:
--> 409                 raise e.with_traceback(None)
    410 
    411         argtypes = []

TypingError: Failed in nopython mode pipeline (step: nopython frontend)
non-precise type array(pyobject, 1d, C)
During: typing of argument at C:\Users\local_PIETAPA\Temp\ipykernel_16492\3239905456.py (4)

File "..\..\..\local_PIETAPA\Temp\ipykernel_16492\3239905456.py", line 4:
<source missing, REPL/exec in use?>

I have no idea how to fix it. Can you help me?

Upvotes: 0

Answers (2)

Amit Choudhary

Reputation: 9

The error in the code is related to the use of possible_EVENTS. It is defined as a NumPy array with a single string element, [""]. However, the add_History function tries to append to this array, and NumPy arrays have no append method.

You can resolve this by using a Python list instead of the NumPy array, appending the elements to the list, and converting the list to a NumPy array at the end.

Here's an updated version of the code:

import numpy as np
from numba import njit
from tqdm import tqdm

possible_EVENTS = [""]

@njit(parallel=True)  # njit already implies nopython mode
def add_History(events,EVENTS):
    index=events.where(events=="Repair")
    add=[]
    pre_events=events[index:]
    if ("LOMT" in pre_events) & ("DIMT" in pre_events):
        for i in pre_events:
            if "FU" not in i:
                add.append(i)
            else:
                break
    if len(add)>=5:
        possible_EVENTS.append(add)

for i in tqdm(packages):
    events=np.array(df_["Event"].loc[df_["Packages"]==i].values)
    add_History(events,possible_EVENTS)
possible_EVENTS = np.array(possible_EVENTS)

Note that using the njit decorator with parallel=True does not always result in faster execution. Test the performance with and without parallelization to determine the best approach.

Upvotes: 0

Jérôme Richard

Reputation: 50308

Numba complains because you are passing it dynamic Python objects, which Numba does not support (and which no similar tool can support efficiently). This is most likely because events is an array of dtype object rather than an array of strings. You need to ensure the input dtype is a string type, not object.
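
As a minimal sketch of that dtype check and conversion (the sample values are made up and stand in for df_["Event"].values, which pandas typically stores as Python objects):

```python
import numpy as np

# Hypothetical stand-in for df_["Event"].values: strings held as Python objects
events_obj = np.array(["Repair", "LOMT", "DIMT"], dtype=object)
print(events_obj.dtype)   # object: the "array(pyobject, 1d, C)" from the error message

# Convert to a fixed-width unicode dtype before passing the array to a jitted function
events_str = events_obj.astype(str)
print(events_str.dtype)   # a unicode dtype such as <U6
```

The conversion copies the data, so it is best done once per array rather than inside a hot loop.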

There are many other issues in the code:

Note that strings are not yet efficiently supported by Numba.
Also note that parallel=True will not automagically parallelise your function. You need to use prange for that and check that it works (i.e. that it does not introduce bugs such as race conditions). You cannot easily use prange here because of the append, so parallel=True is currently useless.
Using add.append(i) is a bad idea here since add is a NumPy array. Appending to an array makes the execution time quadratic instead of linear, because a new array is created on every append. This is a common issue. The solution is to append the items to a list and then convert the list to a NumPy array.
Global arrays are assumed to be constant. You should not modify them (this is very bad practice in software engineering anyway). The solution is to pass the array to the function and return the new one. Indeed, append creates a new array; it does not modify the current one. I strongly advise you to carefully read the documentation of np.array. In the end, your current function computes nothing, even without Numba! Please check your code before trying to make it faster.
events.where(events=="Repair") traverses the whole events array, which is less efficient than iterating over the array and breaking as soon as the item is found.
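
To illustrate the last three points, here is a plain-Python sketch (no Numba) that searches with an early break, collects into a list, and returns the result instead of modifying a global. The function name collect_history, the min_len parameter and the sample events are hypothetical; the filtering rules are copied from the question as I read them.

```python
import numpy as np

def collect_history(events, min_len=5):
    """Return the events from the first "Repair" up to the first one
    containing "FU", or None if the history does not qualify."""
    # Search for the first "Repair" and stop as soon as it is found
    start = -1
    for k in range(len(events)):
        if events[k] == "Repair":
            start = k
            break
    if start < 0:
        return None

    pre_events = events[start:]
    if not (("LOMT" in pre_events) and ("DIMT" in pre_events)):
        return None

    # Collect into a list (cheap appends), convert to an array once at the end
    add = []
    for e in pre_events:
        if "FU" in e:
            break
        add.append(e)

    if len(add) < min_len:
        return None
    return np.array(add)

# Made-up sample data; the caller owns the result list, no global is modified
events = np.array(["Start", "Repair", "LOMT", "DIMT", "A", "B", "FULL", "C"])
results = []
history = collect_history(events)
if history is not None:
    results.append(history)
```

Once this version gives the expected results, it is a more reasonable starting point for measuring whether Numba helps at all, given that strings are not handled efficiently.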

Upvotes: 0
