Tomas

Reputation: 235

OpenCL: AMD Fusion and CL_MEM_USE_HOST_PTR

I am having problems with clCreateBuffer in OpenCL. I am working with an AMD Fusion processor (A10-5800K), so both devices (CPU and GPU) should be able to work on each other's memory.

For the read and result buffers I do:

bufRead = clCreateBuffer(context, CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR, size, data, &err);
bufWrite = clCreateBuffer(context, CL_MEM_READ_WRITE | CL_MEM_USE_HOST_PTR, size, result, &err);

When I call my kernel, the "result" array doesn't change. I know that normal (discrete) GPUs copy the data to device memory and work on that copy. Would they also copy the data back afterwards?

However, I had hoped that the Fusion GPU would not copy the data, since it can work on the same pointer. Unfortunately, I don't see any change in the "result" array, although I do see the changes when I read "bufWrite" with clEnqueueReadBuffer. (I call clFinish before reading "result", so the data should have been written.)

Does anyone know how to truly work on the same array with CPU and GPU? I really want to avoid clEnqueueReadBuffer.

Thanks,

Tomas

Upvotes: 2

Views: 842

Answers (3)

Tomas

Reputation: 235

OK, I searched quite a while for an answer. It is possible, but only under certain circumstances.

You need a GPU with VM (virtual memory) enabled. You can check this with clinfo: look for "(VM)" in the driver version, e.g.,

             Driver version: CAL 1.4.1695 (VM) 
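
The same check can be done in code. The following is only a sketch: `driver_string_has_vm` is a hypothetical helper name, and it assumes the string it receives is the one clinfo prints as "Driver version", which can be queried with clGetDeviceInfo and CL_DRIVER_VERSION.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of OpenCL): returns 1 if the driver version
 * string carries the "(VM)" tag described above, 0 otherwise.
 *
 * The string is assumed to come from the device query, e.g.
 *   char ver[256];
 *   clGetDeviceInfo(device, CL_DRIVER_VERSION, sizeof(ver), ver, NULL);
 */
int driver_string_has_vm(const char *driver_version)
{
    if (driver_version == NULL)
        return 0;
    return strstr(driver_version, "(VM)") != NULL;
}
```

Keeping the string test separate from the device query means it can be tried on strings copied from clinfo output without an OpenCL runtime.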

I have a fairly new APU under Linux and VM is not supported. I suspect it is not supported for any GPU under Linux; I will try Windows next. That would be plausible, because the feature needs cooperation from the OS. I hope Linux support will come in the future.

Anyway, to use it, you need to:

  1. Create your buffers with CL_MEM_USE_HOST_PTR or CL_MEM_ALLOC_HOST_PTR.
  2. Access the buffer from the host with clEnqueueMapBuffer, and release it after reading/writing with clEnqueueUnmapMemObject.
  3. When VM is enabled, nothing is copied and you get direct access. Without VM it still works, but the data is copied.

Check out the AMD APP OpenCL Programming Guide, Section 4.5.2 (Placement).

Upvotes: 3

Eric Bainville

Reputation: 9886

On some platforms/devices, a call to clFinish is enough to synchronize host memory with device memory. In the general case, a call to clEnqueueReadBuffer or clEnqueueMapBuffer is required. For a buffer created with CL_MEM_USE_HOST_PTR, the pointer returned by clEnqueueMapBuffer is derived from the host pointer you provided when creating the buffer.

Upvotes: 0

GaTTaCa

Reputation: 489

I'm not sure I understand you. In OpenCL (for any target platform type, CPU or GPU), a call to clCreateBuffer will allocate some memory on the device and copy the data from the host pointer to the newly allocated memory (although this copy may be done only when a kernel is invoked with this buffer as an argument). I do not think it is possible for a host and a device to work on the same memory without "synchronization" (i.e. clEnqueueReadBuffer).

Upvotes: 0
