raaj

Reputation: 3291

OpenCL: How would one split an existing buffer into two?

Let's say I allocate an OpenCL buffer that holds 200 float values:

cl::Buffer newBuf = cl::Buffer(op::CLManager::getInstance(gpuID)->getContext(), CL_MEM_READ_WRITE, sizeof(float) * 200);

Now I would like to split this cl::Buffer into two objects, one covering the first 100 floats and the other covering the remaining 100, so that I can pass them to two different kernels. I can't find any resource that explains how to do this.

I have no choice: a library I am using returns a very large buffer, and I would prefer to split it into smaller cl::Buffers on the host side and pass those to a kernel, without incurring extra memory costs during the split.

I tried the following, but it segfaults:

cl::Buffer newBuf = cl::Buffer(op::CLManager::getInstance(gpuID)->getContext(), CL_MEM_READ_WRITE, sizeof(float) * 200);
cl::Buffer part1 = cl::Buffer((cl_mem)((float*)(newBuf.get())+0),true); // OK
cl::Buffer part2 = cl::Buffer((cl_mem)((float*)(newBuf.get())+100),true); // SEGFAULT

Upvotes: 0

Views: 440

Answers (1)

Ruyk

Reputation: 795

cl_mem objects are opaque handles, not pointers, so you cannot offset them the way you would offset the pointers returned by cudaMalloc in CUDA.

If you use OpenCL 2.0 or above, you could try SVM (Shared Virtual Memory), which provides a clSVMAlloc[1] function that returns a pointer (and that you can offset). Note that this mechanism may have some overhead in your implementation.

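Another option, which may be better in terms of performance, is to use OpenCL sub-buffers[2], created with clCreateSubBuffer (cl::Buffer::createSubBuffer in the C++ wrapper). A sub-buffer represents a region of the parent buffer rather than a separate allocation. There are some limitations (e.g., the origin must be a multiple of the device's CL_DEVICE_MEM_BASE_ADDR_ALIGN value, which is reported in bits), so check the underlying device limitations.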
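To make the alignment limitation concrete, here is a minimal sketch of the check you would do before creating a sub-buffer. The helper names (is_valid_origin, align_down) are my own, and the 1024-bit alignment is only an assumed example value; query CL_DEVICE_MEM_BASE_ADDR_ALIGN on your device for the real one. The actual OpenCL call is shown as a comment so the snippet compiles without the OpenCL headers.

```cpp
#include <cstddef>

// Hypothetical helpers, not part of OpenCL.
// CL_DEVICE_MEM_BASE_ADDR_ALIGN is reported in bits, while a sub-buffer
// origin is given in bytes, so the alignment is converted first.
constexpr bool is_valid_origin(std::size_t origin_bytes, unsigned align_bits) {
    return align_bits >= 8 && origin_bytes % (align_bits / 8) == 0;
}

// Largest aligned origin that is <= the requested one.
constexpr std::size_t align_down(std::size_t origin_bytes, unsigned align_bits) {
    return align_bits >= 8 ? origin_bytes - origin_bytes % (align_bits / 8)
                           : origin_bytes;
}

// Assumed example: the device reports 1024 bits (128 bytes).
constexpr unsigned kAssumedAlignBits = 1024;

// Splitting at float 100 means a byte origin of 400, which is not a
// multiple of 128 under this assumption.
static_assert(!is_valid_origin(100 * sizeof(float), kAssumedAlignBits),
              "400 bytes is not 128-byte aligned");
static_assert(is_valid_origin(0, kAssumedAlignBits), "origin 0 is always fine");

// With a valid origin, the sub-buffer itself would be created roughly like this
// (requires the OpenCL C++ wrapper header):
//
//   cl_buffer_region region = { origin_bytes, size_bytes };
//   cl_int err = CL_SUCCESS;
//   cl::Buffer part2 = newBuf.createSubBuffer(
//       CL_MEM_READ_WRITE, CL_BUFFER_CREATE_TYPE_REGION, &region, &err);
//   // err is CL_MISALIGNED_SUB_BUFFER_OFFSET if the origin is not aligned.
```

Under that assumed alignment, a split at exactly 100 floats would be rejected, so you would either pick a split point that lands on an aligned byte offset or pass an element offset to the kernel as an extra argument.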

[1] https://www.khronos.org/registry/OpenCL/sdk/2.0/docs/man/xhtml/clSVMAlloc.html

[2] https://www.khronos.org/registry/OpenCL/sdk/1.2/docs/man/xhtml/clCreateSubBuffer.html

Upvotes: 2
