Reputation: 1475
My algorithm consists from two steps:
Now first step runs on CPU because it hard to parallelize. I want to run it on GPU because each step of generation takes some time. And I want to run second step for already generated data immediately.
Can I run another opencl kernel from currently runned kernel in separated thread? Or it be run in the some thread that caller kernel?
Some pseudocode for illustrate my point:
__kernel second(__global int * data, int index) {
//work on data[i]. This process takes a lot of time
}
__kernel first(__global int * data, const int length) {
for (int i = 0; i < length; i++) {
// generate data and store it in data[i]
// This kernel will be launched in some thread that caller or in new thread?
// If in same thread, there are ways to launch it in separated thread?
second(data, i);
}
}
Upvotes: 1
Views: 1254
Reputation: 8410
You should launch one kernel. Then do a clFInish(); Then execute the next kernel.
There are more efficient ways but I will only mess you with events.
You just use the memory output of the first kernel as input for the second one. With that, you aboid CPU->GPU copy process.
Upvotes: 3
Reputation: 57
I believe that the global work size might be considered as the number of threads that will be executed, in one way or another. Correct me if I'm wrong.
Upvotes: 0
Reputation: 56347
No, OpenCL has no concept of threads, and neither a kernel execution can launch another kernel. All kernel execution is triggered by the CPU.
Upvotes: 4