rozyang

Reputation: 619

How to parameterize the size of cuda.local.array in Numba?

I want to allocate a small local array in a Numba CUDA kernel. However, I find that it does not allow a parameterized array size; only a constant size is allowed. How can I solve this?

import numba
from numba import cuda

# This works, but the array size has to be hard-coded
@cuda.jit
def kernel1():
    arr = numba.cuda.local.array(3, dtype=numba.float32)

kernel1[2,2]()


# I want this, but it does not work
@cuda.jit
def kernel2(dim):
    arr = numba.cuda.local.array(dim, dtype=numba.float32)

kernel2[2,2](3)

Below is the error message

TypingError: Failed in cuda mode pipeline (step: nopython frontend)
No implementation of function Function(<function local.array at 0x7f074e54dee0>) found for signature:
 
 >>> array(int64, dtype=class(float32))
 
There are 2 candidate implementations:
  - Of which 2 did not match due to:
  Overload of function 'array': File: numba/cuda/cudadecl.py: Line 44.
    With argument(s): '(int64, dtype=class(float32))':
   No match.

During: resolving callee type: Function(<function local.array at 0x7f074e54dee0>)
During: typing of call at /tmp/ipykernel_18276/1701838372.py (3)


File "../../../../../tmp/ipykernel_18276/1701838372.py", line 3:
<source missing, REPL/exec in use?>

Upvotes: 1

Views: 690

Answers (1)

talonmies

Reputation: 72348

I find that it does not allow a parameterized array size; only a constant size is allowed. How can I solve this?

You can’t. As you say, only a constant size is allowed. This isn’t a Numba limitation, it is a limitation of the CUDA programming model. Thread local memory is always statically allocated by the compiler.

There may be some meta-programming tricks you can try, analogous to C++ templates, but those will only leave you with multiple versions of the kernel, each with a different statically compiled local array size, not true runtime dynamic allocation.
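
As a minimal sketch of one such trick: Numba freezes closure variables as compile-time constants, so a hypothetical factory function (here called make_kernel, a name I'm assuming for illustration) can bake a given size into each freshly compiled kernel.

import numpy as np
import numba
from numba import cuda

def make_kernel(size):
    # `size` is captured from the enclosing scope and frozen as a
    # compile-time constant when this kernel is compiled, so
    # cuda.local.array sees a constant shape.
    @cuda.jit
    def kernel(out):
        arr = cuda.local.array(size, dtype=numba.float32)
        for i in range(size):
            arr[i] = i
        out[cuda.grid(1)] = arr[size - 1]
    return kernel

# Each distinct size produces a separately compiled kernel.
kernel3 = make_kernel(3)
kernel5 = make_kernel(5)

out = np.zeros(4, dtype=np.float32)
kernel3[2,2](out)

Every new size passed to the factory pays a fresh compilation cost, so this only makes sense when the set of sizes is small and known ahead of time; it is still static allocation, not runtime allocation.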

Upvotes: 3
