Reputation: 2159
I was trying out SYCL/DPC++ and wrote the code below. It allocates an array deviceArr on the device, copies the values of hostArr into it with memcpy, increments each device value by 1 in a parallel_for kernel, and copies the values back to the host with memcpy.
queue q;
std::array<int, 10> hostArr;
for (auto &val : hostArr)
    val = 1;
int *deviceArr = malloc_device<int>(10, q);
q.submit([&](handler &h)
         { memcpy(deviceArr, &hostArr[0], 10 * sizeof(int)); });
q.submit([&](handler &h)
         { h.parallel_for(10, [=](auto &idx)
                          { deviceArr[idx]++; }); });
q.submit([&](handler &h)
         { memcpy(&hostArr[0], deviceArr, 10 * sizeof(int)); });
This code compiles fine, but at run time I get the following error:
**Command group submitted without a kernel or a explicit memory operation. -59 (CL_INVALID_OPERATION)**
However, as far as I can see, every command group I submit contains either a kernel (`parallel_for`) or a memory operation (`memcpy`). Can anybody explain why this error occurs?
Upvotes: 0
Views: 269
Reputation: 460
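The device compiler only sees kernels and the functions called from them. The lambda you pass to `q.submit` is ordinary host code, so the `memcpy` you call there is the regular `std::memcpy`. The SYCL runtime has no way of knowing that you meant it as a memory operation, so those command groups end up with neither a kernel nor an explicit memory operation.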
To submit your memcpy, write `h.memcpy(...)` instead, or use the shorthand `q.memcpy(...)`.
Finally, since you're using USM, you have to take care of the synchronisation yourself. There's no guarantee that the three command groups will execute in submission order unless you use an in-order queue. You can either call `wait()` after each submission or use `h.depends_on(...)`.
Upvotes: 2