Jake Ren

Reputation: 13

Why does clEnqueueMapBuffer not map to the original pointer? OpenCL, Caffe

Assume a CPU pointer (cpu_ptr_) already exists, and I create a GPU buffer (cl_gpu_mem_) from it. The problem is that when I map the GPU buffer back to a CPU pointer (mapped_ptr), mapped_ptr is not equal to the original pointer (cpu_ptr_), so CHECK_EQ(mapped_ptr, cpu_ptr_) fails.

        cl_gpu_mem_ = clCreateBuffer(ctx.handle().get(),
                          CL_MEM_READ_WRITE | CL_MEM_USE_HOST_PTR,
                          size_, cpu_ptr_, &err);
        void *mapped_ptr = clEnqueueMapBuffer(
                              ctx.get_queue().handle().get(),
                              cl_gpu_mem_,
                              true,
                              CL_MAP_READ | CL_MAP_WRITE,
                              0, size_, 0, NULL, NULL, NULL);
        CHECK_EQ(mapped_ptr, cpu_ptr_)
          << "Device claims it supports zero copy"
          << " but failed to create correct user ptr buffer";

I don't understand why this error occurs. Could you give me any advice on this problem, or a solution to it? Thank you very much.

Upvotes: 1

Views: 72

Answers (1)

Tim

Reputation: 2796

OpenCL implementations are free to mirror the host pointer (making it non-zero copy). On devices that support true zero copy (e.g. Intel GPU), there are still typically constraints that determine whether the implementation can really use that host allocation directly or must mirror it. On Intel the host address must be page aligned and the length must be a multiple of 128 bytes (an even cacheline). (I typically just page align both.) I am not sure what the requirements are for AMD and other vendors.

Look into aligned_alloc, or over-allocate a couple of extra pages and use a page boundary within that allocation as the base.
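As a rough illustration of the "page align both" approach, here is a minimal sketch in C. It assumes a POSIX system and uses posix_memalign rather than aligned_alloc; the helper names round_up and alloc_page_aligned are hypothetical and not part of OpenCL or Caffe.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Round n up to the next multiple of align (align must be a power of two). */
size_t round_up(size_t n, size_t align) {
    return (n + align - 1) & ~(align - 1);
}

/* Hypothetical helper: allocate a block whose base address is page aligned
 * and whose length is a whole number of pages. On success, the padded length
 * is stored in *padded_size. Returns NULL on failure or when size is 0.
 * Release the block with free(). */
void *alloc_page_aligned(size_t size, size_t *padded_size) {
    long page_l = sysconf(_SC_PAGESIZE);
    if (size == 0 || page_l <= 0) {
        return NULL;
    }
    size_t page = (size_t)page_l;
    assert((page & (page - 1)) == 0); /* round_up relies on a power of two */

    size_t padded = round_up(size, page);
    void *ptr = NULL;
    if (posix_memalign(&ptr, page, padded) != 0) {
        return NULL;
    }
    *padded_size = padded;
    return ptr;
}

/* Intended use (not compiled here, shown for context only):
 *   size_t padded;
 *   void *host = alloc_page_aligned(size_, &padded);
 *   cl_mem buf = clCreateBuffer(context,
 *                               CL_MEM_READ_WRITE | CL_MEM_USE_HOST_PTR,
 *                               padded, host, &err);
 */
```

A page-aligned base and page-multiple length only make the allocation a candidate for zero copy; whether the driver uses it directly is still up to the implementation.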

Upvotes: 1
