Figa17

Reputation: 781

Send the same data to multiple kernels in OpenCL

I have multiple kernels. I send some inputs to the first one, and the output of each kernel is the input for the next. My queue of kernels repeats this pattern 8 times, until the last kernel produces the actual output I need.

This is an example of what I did:

cl::Kernel kernel1 = cl::Kernel(OPETCL::program, "forward");

// set the kernel arguments
kernel1.setArg(0, cl_rowCol);
kernel1.setArg(1, cl_data);
kernel1.setArg(2, cl_x);
kernel1.setArg(3, cl_b);
kernel1.setArg(4, sizeof(int), &largo);

// run the kernel
OPETCL::queue.enqueueNDRangeKernel(kernel1, cl::NullRange, global, local, NULL, &profilingApp);


/********************************/
/**    run the X symmetries    **/
/********************************/
cl::Kernel kernel2 = cl::Kernel(OPETCL::program, "forward_symmX");


// set the kernel arguments
kernel2.setArg(0, cl_rowCol);
kernel2.setArg(1, cl_data);
kernel2.setArg(2, cl_x);
kernel2.setArg(3, cl_b);
kernel2.setArg(4, cl_symmLOR_X);
kernel2.setArg(5, cl_symm_Xpixel);
kernel2.setArg(6, sizeof(int), &largo);

// run the kernel
OPETCL::queue.enqueueNDRangeKernel(kernel2, cl::NullRange, global, local, NULL, &profilingApp);

OPETCL::queue.finish();

OPETCL::queue.enqueueReadBuffer(cl_b, CL_TRUE, 0, sizeof(float) * lors, b, NULL, NULL);

In this case cl_b is the output I need.



My question concerns the arguments: they are the same for every kernel, except for one that differs.

Is the way I set the arguments correct?
Are the arguments kept on the device during the execution of all the kernels?

Upvotes: 2

Views: 448

Answers (2)

mfa

Reputation: 5087

I think you get this behaviour for free, as long as you don't specify CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE when you create your command queue.

It looks like you're doing it correctly. In general, this is the process:

  1. create your buffer(s)
  2. queue a buffer copy to the device
  3. queue the kernel execution
  4. repeat #3 for as many kernels as you need to run, passing the buffer as the correct parameter. Use setArg to change/add params. The buffer will still exist on the device, holding the modifications made by the previous kernels
  5. queue a copy of the buffer back to the host

If you do specify CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE, you will have to use events to control the execution order of the kernels. This seems unnecessary for your example though.
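
To make the data flow in those steps concrete without needing a device, here is a minimal plain C++ model. It is not real OpenCL host code: DeviceBuffer, InOrderQueue and run_pipeline are hypothetical names invented for this sketch, and the bodies of forward and forward_symmX are made-up arithmetic that only borrow the kernel names from the question. What it tries to show is that commands on an in-order queue run in the order they were enqueued, and that a buffer bound to several kernels keeps whatever the previous kernel wrote into it.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical stand-in for a device buffer (cl::Buffer in the real API):
// a shared handle to storage that stays alive between kernel launches.
using DeviceBuffer = std::shared_ptr<std::vector<float>>;

// Hypothetical stand-in for an in-order command queue:
// commands execute strictly in the order they were enqueued.
class InOrderQueue {
public:
    void enqueue(std::function<void()> command) {
        pending_.push(std::move(command));
    }

    // Plays the role of finish(): drain every pending command, oldest first.
    void finish() {
        while (!pending_.empty()) {
            pending_.front()();
            pending_.pop();
        }
    }

private:
    std::queue<std::function<void()>> pending_;
};

// Made-up "kernel" body: overwrites b with 2 * x.
void forward(const DeviceBuffer& x, const DeviceBuffer& b) {
    assert(x->size() == b->size());
    for (std::size_t i = 0; i < x->size(); ++i) {
        (*b)[i] = 2.0f * (*x)[i];
    }
}

// Made-up "kernel" body: accumulates x on top of what b already holds.
void forward_symmX(const DeviceBuffer& x, const DeviceBuffer& b) {
    assert(x->size() == b->size());
    for (std::size_t i = 0; i < x->size(); ++i) {
        (*b)[i] += (*x)[i];
    }
}

std::vector<float> run_pipeline(const std::vector<float>& host_x) {
    // 1. create the buffers
    DeviceBuffer x = std::make_shared<std::vector<float>>(host_x.size());
    DeviceBuffer b = std::make_shared<std::vector<float>>(host_x.size(), 0.0f);

    InOrderQueue queue;

    // 2. queue the host-to-device copy
    queue.enqueue([x, host_x] { *x = host_x; });

    // 3 and 4. queue the kernels; both are bound to the same x and b
    queue.enqueue([x, b] { forward(x, b); });
    queue.enqueue([x, b] { forward_symmX(x, b); });

    // 5. queue the device-to-host copy of the result
    std::vector<float> host_b;
    queue.enqueue([b, &host_b] { host_b = *b; });

    queue.finish();
    return host_b;
}
```

In this model, run_pipeline({1, 2, 3}) yields {3, 6, 9}: forward writes 2 * x into b, and forward_symmX then adds x to the values it finds already stored in b.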

Upvotes: 0

Vlad

Reputation: 39

Since you are using the same queue and OpenCL context, this is OK: your kernels can use the data (arguments) calculated by the previous kernel, and the data will be kept on the device. I suggest calling clFinish after each kernel execution to ensure that the previous kernel has finished its calculation before the next one starts. Alternatively, you can use events to ensure that.

Upvotes: 1
