Reputation: 235
I have some problems with clCreateBuffer in OpenCL. I am working with an AMD Fusion processor (A10-5800K), so both devices (CPU and GPU) should be able to work on each other's memory.
For the read and result buffers I do:
bufRead = clCreateBuffer(context, CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR, size, data, &err);
bufWrite = clCreateBuffer(context, CL_MEM_READ_WRITE | CL_MEM_USE_HOST_PTR, size, result, &err);
When I call my kernel, the "result" array doesn't change. I know that a normal GPU would copy the data to device memory and work on that. Would a normal GPU copy the data back afterwards?
However, I had hoped that the Fusion GPU would not copy the data, because it can work on the same pointer. Unfortunately, I don't see any change in the "result" array. When I read "bufWrite" with clEnqueueReadBuffer, I do see the changes. (I call clFinish before reading "result", so the data should have been written.)
Does anyone know how to truly work on the same array with CPU and GPU? I really want to avoid clEnqueueReadBuffer.
Thanks,
Tomas
Upvotes: 2
Views: 842
Reputation: 235
OK, I searched quite a while for an answer. It is possible, but only under certain circumstances.
You need a GPU that has VM (virtual memory) enabled. You can check this with clinfo: look for "(VM)" in the driver version, e.g.,
Driver version: CAL 1.4.1695 (VM)
I have a fairly new APU under Linux, and VM is not supported there. I think it is not supported for any GPU under Linux. I will try Windows next. That is plausible, because the driver has to cooperate with the OS for this. I hope Linux support will come in the future.
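The clinfo check above could also be done from a program. Below is a minimal sketch in C, assuming the driver-version string has already been obtained (for example from clinfo output, or from clGetDeviceInfo with CL_DRIVER_VERSION) and that the "(VM)" marker appears in it as in the sample line above. The function name driver_version_has_vm is hypothetical, chosen for illustration only.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if the driver-version string carries
 * the "(VM)" marker, 0 otherwise (including for a NULL string).
 * Assumption: the string looks like the clinfo "Driver version" line,
 * e.g. "CAL 1.4.1695 (VM)". */
int driver_version_has_vm(const char *driver_version)
{
    if (driver_version == NULL) {
        return 0;
    }
    /* strstr returns NULL when the marker is not found. */
    return strstr(driver_version, "(VM)") != NULL;
}
```

With the sample string "CAL 1.4.1695 (VM)" this returns 1. A string without the marker returns 0, in which case zero-copy access as described here is probably not available on that driver.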
Anyway, to use it, check out the AMD APP OpenCL Programming Guide, Section 4.5.2 - Placement.
Upvotes: 3
Reputation: 9886
On some platforms/devices, a call to clFinish is enough to synchronize host memory with device memory. In the general case, a call to clEnqueueReadBuffer or clEnqueueMapBuffer is required. For a buffer created with CL_MEM_USE_HOST_PTR, the pointer returned by clEnqueueMapBuffer is derived from the host pointer you provided when creating the buffer.
Upvotes: 0
Reputation: 489
I'm not sure I understand you. In OpenCL (for any target platform type, CPU or GPU), a call to clCreateBuffer will allocate some memory on the device and copy the data from the host pointer to the newly allocated memory (although this copy may be done only when a kernel is invoked with this pointer as an argument). I do not think it is possible for a host and a device to work on the same memory without "synchronization" (i.e., clEnqueueReadBuffer).
Upvotes: 0