OpenCL / C++ host code running concurrently and memory maintenance

Question

I am trying to use OpenCL to accelerate certain segments of a pre-existing C++ simulation. Currently, I have selected a loop that runs for 1k-1M iterations on every simulation time-step.

As I currently understand it, I have to manually write the data to the kernel's buffers using enqueueWriteBuffer before calling the kernel. I have to do this every time-step so that the kernel operates on the correct data. Is it possible to make the writing of the data to the buffers happen concurrently with the existing C++ code?

As it stands, before the kernel is launched, the existing C++ code executes another loop, which takes about as long as my memory transfers do. This loop does not change or affect the data that I have to write to the buffers before calling the kernel. Is it possible to have the memory transfer occur concurrently during this period? I would prefer to have the host run the loop while the data is being written to the buffers at the same time, saving precious simulation time.

Thanks!

DarkZeros · Accepted Answer

I don't really see a big problem here.

What you need is to copy the data asynchronously while you perform another operation in parallel. That can be done with a non-blocking call to clEnqueueWriteBuffer(), i.e. with the blocking_write argument set to CL_FALSE. Additionally, you can even run the kernel in parallel and keep doing the C++ loop on the CPU. There is no need to wait, since the kernel's data is independent of the other C++ loop's data.

//Get the data for OpenCL ready
...

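//Copy it asynchronously (CL_FALSE = non-blocking write, offset 0, no wait list, no event)
//ptr_to_data must not be modified or freed until the write has completed
clEnqueueWriteBuffer(queue, buffer, CL_FALSE, 0, size, ptr_to_data, 0, NULL, NULL);
clFlush(queue); //Ensure it gets submitted to the device (not really needed)
//Run the kernel (asynchronously as well)
//On a default in-order queue it starts only after the write has finished
clEnqueueNDRangeKernel(...);

//Do something else
...
//Everything that is clEnqueueXXXX runs on the GPU, so the CPU is free for the C++ loop.

//Get the data back
clEnqueueReadBuffer(...);

//Wait for the GPU to finish
clFinish(queue); //Or, by making the read blocking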
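
To make the overlap pattern concrete without needing an OpenCL runtime, here is a minimal sketch in plain C++. It is only an analogy: std::async stands in for the non-blocking clEnqueueWriteBuffer() call, and the returned std::future plays the role of an OpenCL event. The names FakeDeviceBuffer, enqueue_write_async, other_host_loop and run_time_step are hypothetical and not part of OpenCL or of the asker's code.

```cpp
#include <future>
#include <numeric>
#include <vector>

// Hypothetical stand-in for the device buffer (a cl_mem in real OpenCL code).
struct FakeDeviceBuffer {
    std::vector<float> storage;
};

// Stand-in for a non-blocking write: the copy runs on another thread and the
// returned future is used like an OpenCL event to wait for completion.
std::future<void> enqueue_write_async(FakeDeviceBuffer& dst,
                                      const std::vector<float>& src) {
    return std::async(std::launch::async, [&dst, &src] {
        dst.storage.assign(src.begin(), src.end());
    });
}

// The unrelated host loop that runs while the copy is in flight.
// It must not touch the source data of the pending copy.
double other_host_loop(const std::vector<double>& host_only) {
    return std::accumulate(host_only.begin(), host_only.end(), 0.0);
}

// One simulation time-step: start the copy, do the host work, then wait.
double run_time_step(FakeDeviceBuffer& device,
                     const std::vector<float>& kernel_input,
                     const std::vector<double>& host_only) {
    std::future<void> copy_done = enqueue_write_async(device, kernel_input);
    double host_result = other_host_loop(host_only);  // overlaps with the copy
    copy_done.get();  // analogous to clFinish() or waiting on an event
    return host_result;
}
```

Calling run_time_step(device, kernel_input, host_only) once per time-step gives the same shape as the OpenCL version: the transfer is started first, the independent loop runs in the meantime, and the wait happens only when the result is actually needed.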
