pablo

Reputation: 404

Does gpuocelot support dynamic memory allocation in CUDA device code?

My algorithm (parallel multi-frontal Gaussian elimination) needs to dynamically allocate memory (for tree building) inside a CUDA kernel. Does anyone know whether gpuocelot supports such things?

According to this answer (stackoverflow-link) and the CUDA programming guide, I should be able to do this. But with gpuocelot I get errors at runtime.
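For reference, the pattern in question can be sketched roughly as follows (the `Node` struct and kernel are made-up illustrations, not my actual solver code; device-side `malloc()` requires compute capability 2.0+ and compilation with e.g. `nvcc -arch=sm_20`):

```cuda
#include <cstdio>

// Hypothetical per-thread tree-node allocation using device-side malloc.
struct Node {
    int   value;
    Node *left;
    Node *right;
};

__global__ void buildNodes(int *ok)
{
    // Device-side malloc; returns NULL if the device heap is exhausted.
    Node *n = (Node *)malloc(sizeof(Node));
    if (n != NULL) {
        n->value = threadIdx.x;
        n->left  = n->right = NULL;
        ok[threadIdx.x] = 1;
        free(n);
    } else {
        ok[threadIdx.x] = 0;
    }
}

int main()
{
    // Enlarge the device malloc heap before the first kernel launch
    // (the default is 8 MB). This is the host-side call that triggers
    // the unimplemented cudaDeviceGetLimit/SetLimit path below.
    cudaDeviceSetLimit(cudaLimitMallocHeapSize, 32 * 1024 * 1024);

    int *ok;
    cudaMalloc(&ok, 64 * sizeof(int));
    buildNodes<<<1, 64>>>(ok);
    cudaDeviceSynchronize();
    cudaFree(ok);
    return 0;
}
```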

Errors:

  1. When I call malloc() inside the kernel I get this error:
    (2.000239) ExternalFunctionSet.cpp:371:  Assertion message: LLVM required to call external host functions from PTX.
    solver: ocelot/ir/implementation/ExternalFunctionSet.cpp:371: void ir::ExternalFunctionSet::ExternalFunction::call(void*, const ir::PTXKernel::Prototype&): Assertion `false' failed.
  2. When I try to get or set the malloc heap size in host code (cudaDeviceGetLimit / cudaDeviceSetLimit):
    solver: ocelot/cuda/implementation/CudaRuntimeInterface.cpp:811: virtual cudaError_t cuda::CudaRuntimeInterface::cudaDeviceGetLimit(size_t*, cudaLimit): Assertion `0 && "unimplemented"' failed.

Maybe I have to tell the compiler (somehow) that I want to use the device-side malloc()?

Any advice?

Upvotes: 0

Views: 528

Answers (1)

pablo

Reputation: 404

You can find the answer on the gpuocelot mailing list:

gpuocelot mailing list link

Upvotes: 1
