How to call a `@guvectorize` function inside another `@guvectorize` function in Numba?

Question

I'm trying to call a @guvectorize function inside another @guvectorize function, but I get this error:

Untyped global name 'regNL_nb': cannot determine Numba type of 

File "di.py", line 12:
def H2Delay_nb(S1, S2, R2):
    H1 = regNL_nb(S1, S2)
    ^

Here is an MRE:

import numpy as np
from numba import guvectorize, float64, int64, njit, cuda, jit

@guvectorize(["float64[:], float64[:], float64[:]"], '(n),(n)->(n)')
def regNL_nb(S1, S2, h2):
    for i in range(len(S1)):
        h2[i] = S1[i] + S2[i]

@guvectorize(["float64[:], float64[:],  float64[:]"], '(n),(n)->(n)',nopython=True)
def H2Delay_nb(S1, S2, R2):
    H1 = regNL_nb(S1, S2)
    H2 = regNL_nb(S1, S2,)
    for i in range(len(S1)):
        R2[i] =  H1[i] + H2[i]


S1 = np.array([1,2,3,4,5,6,7,8,9])
S2 = np.array([1,2,3,4,5,6,7,8,9])
H2 = H2Delay_nb(S1, S2)
print(H2)

I don't know how to tell Numba that regNL_nb is a guvectorize function.

Arty · Accepted Answer

My answer applies only if you're fine with replacing @guvectorize with @njit. The code stays essentially the same and runs just as fast; only the syntax is slightly longer.

The problem is probably that Numba does not accept @guvectorize-ed functions inside another @guvectorize-ed function in nopython mode.

Numba does, however, happily accept regular @njit-ed functions inside other @njit-ed functions. So you can rewrite your functions to use @njit; to callers, the function signature stays the same as the @guvectorize-ed one. The @njit version just needs an extra np.empty_like(...) and a return inside the function.

As a reminder, @njit-ed functions always run in nopython mode, so the @njit code will be as fast as guvectorize with nopython=True.

I also provide a CUDA solution as the second code snippet.

You can also make only the internal helper function @njit-ed and probably keep the external one @guvectorize-ed. Also, if you want a generic function that accepts any input types, just remove the signature 'f8[:](f8[:], f8[:])' from the @njit decorator; the signature will then be resolved automatically on the first call.

The final code looks like this:

import numpy as np
from numba import guvectorize, float64, int64, njit, cuda, jit

@njit('f8[:](f8[:], f8[:])', cache = True)
def regNL_nb(S1, S2):
    h2 = np.empty_like(S1)
    for i in range(len(S1)):
        h2[i] = S1[i] + S2[i]
    return h2
        
@njit('f8[:](f8[:], f8[:])', cache = True)
def H2Delay_nb(S1, S2):
    H1 = regNL_nb(S1, S2)
    H2 = regNL_nb(S1, S2)
    R2 = np.empty_like(H1)
    for i in range(len(S1)):
        R2[i] =  H1[i] + H2[i]
    return R2

S1 = np.array([1,2,3,4,5,6,7,8,9], dtype = np.float64)
S2 = np.array([1,2,3,4,5,6,7,8,9], dtype = np.float64)
H2 = H2Delay_nb(S1, S2)
print(H2)

Output:

[ 4.  8. 12. 16. 20. 24. 28. 32. 36.]

CUDA variant of the same code. It needs an extra wrapper around each kernel if you want to create and return the resulting array automatically, because a CUDA kernel cannot return a value:

import numpy as np
from numba import cuda

# Device function: callable only from other CUDA code, so it may return a value.
# It works on one element, because CUDA code cannot allocate NumPy arrays.
@cuda.jit(device=True)
def regNL_nb_cu(S1, S2, i):
    return S1[i] + S2[i]

# Kernel: cannot return a value, so the result is written into R2.
@cuda.jit('void(f8[:], f8[:], f8[:])')
def H2Delay_nb_cu(S1, S2, R2):
    i = cuda.grid(1)        # global index of this thread
    if i < R2.shape[0]:     # skip threads past the end of the array
        H1 = regNL_nb_cu(S1, S2, i)
        H2 = regNL_nb_cu(S1, S2, i)
        R2[i] = H1 + H2

# Plain Python wrapper: a kernel must be launched from host code
# with an explicit [blocks, threads] configuration.
def H2Delay_nb(S1, S2):
    R2 = np.empty_like(S1)
    threads = 128
    blocks = (S1.size + threads - 1) // threads
    H2Delay_nb_cu[blocks, threads](S1, S2, R2)
    return R2

S1 = np.array([1,2,3,4,5,6,7,8,9], dtype = np.float64)
S2 = np.array([1,2,3,4,5,6,7,8,9], dtype = np.float64)
H2 = H2Delay_nb(S1, S2)
print(H2)
