Philipp Eller

Reputation: 31

Array allocation for CPU and GPU functions in NUMBA

I'm trying to write some functions in numba that I can use interchangeably for different targets (cpu, cuda, parallel). The problem I'm having is that allocating a new array works differently in CUDA device code, e.g.:

cuda.local.array(shape, dtype)

vs. the equivalent in a CPU function, i.e.

np.empty(shape, dtype)

Is there a clever way to deal with this without having to write separate functions?

Upvotes: 0

Views: 513

Answers (1)

Philipp Eller

Reputation: 31

I have found one dirty workaround for the problem; it is the only way I could make it work. Use the @myjit decorator below in place of @jit and @cuda.jit, and allocate all arrays with cuda.local.array.

import inspect

import numpy as np
from numba import cuda, jit

# `target` is assumed to be a module-level string, e.g. 'cuda' or 'cpu'.

def myjit(f):
    '''
    f : function

    Decorator to assign the right jit for different targets.
    In case of non-cuda targets, all instances of `cuda.local.array`
    are replaced by `np.empty`. This is a dirty fix; hopefully numba
    will support numpy array allocation in device code in the near
    future and this will not be necessary anymore.
    '''
    if target == 'cuda':
        return cuda.jit(f, device=True)
    else:
        # Rewrite the function's source: drop the decorator line,
        # swap the CUDA allocation call for its NumPy equivalent,
        # then re-compile the result for the CPU.
        source = inspect.getsource(f).splitlines()
        assert '@myjit' in source[0]
        source = '\n'.join(source[1:]) + '\n'
        source = source.replace('cuda.local.array', 'np.empty')
        exec(source)
        fun = eval(f.__name__)
        newfun = jit(fun, nopython=True)
        # needs to be exported to globals
        globals()[f.__name__] = newfun
        return newfun
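The core of the workaround is just a textual substitution before compilation. Here is a minimal sketch of that substitution on its own, runnable without numba or a GPU (the function name and body are illustrative, not part of the original answer):

```python
import numpy as np

# Source of a hypothetical device-style function, written with the
# CUDA-side allocation call.
src = '''
def fill_ones(n):
    buf = cuda.local.array(3, np.float64)
    total = 0.0
    for i in range(n):
        buf[i] = 1.0
        total += buf[i]
    return total
'''

# The same textual substitution myjit performs for non-cuda targets:
cpu_src = src.replace('cuda.local.array', 'np.empty')

# Re-exec the rewritten source; it now only needs numpy.
namespace = {'np': np}
exec(cpu_src, namespace)
fill_ones = namespace['fill_ones']

print(fill_ones(3))  # 3.0
```

In the decorator above, the compiled result is additionally registered in `globals()` so that other jitted functions can still resolve the name after rewriting.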

Upvotes: 1
