Reputation: 147
I'm porting a CUDA application to OpenCL, and I noticed that CUDA lets you write data to its "buffers" in chunks. What I mean is the following:
int *vals = new int[N/4];
int *d_vec = nullptr;
cudaMalloc((void**)&d_vec, sizeof(int) * N);
for(int i = 0; i < 4; i++){
cudaMemcpy(d_vec + i*(N/4), vals, sizeof(int) * N/4, cudaMemcpyHostToDevice);
}
The code above writes the vals array (which is 1/4 the size of the d_vec buffer) into d_vec four times, one chunk after another. So my question is: is it possible to do the same with OpenCL? That is, can I allocate a buffer and write values to it sequentially, without having to supply a host array of the full buffer size?
Thank you!
Upvotes: 0
Views: 100
Reputation: 1229
Yes, you can specify a size and an offset for clEnqueueWriteBuffer, which would be your replacement for cudaMemcpy. Both values are given in bytes, not in elements.
cl_int clEnqueueWriteBuffer(
cl_command_queue command_queue,
cl_mem buffer,
cl_bool blocking_write,
size_t offset, // in bytes; from your example: i*(N/4)*sizeof(int)
size_t size, // in bytes; from your example: sizeof(int) * N/4
const void* ptr,
cl_uint num_events_in_wait_list,
const cl_event* event_wait_list,
cl_event* event);
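Since the offset is a byte offset into the buffer rather than an element index, here is a minimal host-only sketch of how the four writes line up. It makes no OpenCL calls: FakeBuffer, write_bytes and fill_in_quarters are hypothetical names invented for this illustration, and std::memcpy stands in for the device transfer. It only models the offset/size arithmetic.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical host-only stand-in for a device buffer: just raw bytes.
using FakeBuffer = std::vector<unsigned char>;

// Models the offset/size semantics of clEnqueueWriteBuffer:
// both values are byte counts, not element counts.
void write_bytes(FakeBuffer& buffer, std::size_t offset_bytes,
                 std::size_t size_bytes, const void* src) {
    assert(offset_bytes + size_bytes <= buffer.size());
    std::memcpy(buffer.data() + offset_bytes, src, size_bytes);
}

// Fills a buffer of n ints by writing the same n/4-int chunk four times,
// mirroring the loop in the question.
std::vector<int> fill_in_quarters(std::size_t n, const std::vector<int>& chunk) {
    assert(n % 4 == 0 && chunk.size() == n / 4);
    FakeBuffer buffer(n * sizeof(int));
    const std::size_t chunk_bytes = chunk.size() * sizeof(int);
    for (std::size_t i = 0; i < 4; ++i) {
        // Byte offset of chunk i: element index i*(n/4) scaled by sizeof(int).
        write_bytes(buffer, i * chunk_bytes, chunk_bytes, chunk.data());
    }
    // Copy the bytes back out as ints so the result can be inspected.
    std::vector<int> result(n);
    std::memcpy(result.data(), buffer.data(), buffer.size());
    return result;
}
```

With n = 8 and a chunk of {1, 2}, the byte offsets are 0, 2*sizeof(int), 4*sizeof(int) and 6*sizeof(int). Passing the element index i*(N/4) directly would make the chunks overlap near the start of the buffer.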
Upvotes: 2
Reputation: 5746
Yes, this is indeed possible with enqueueWriteBuffer from the OpenCL C++ bindings:
cl_int *vals = new cl_int[N/4];
Buffer d_vec;
d_vec = Buffer(context, CL_MEM_READ_WRITE, N*sizeof(cl_int));
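for(int i = 0; i < 4; i++){
    // The offset is in bytes, so scale the element index by sizeof(cl_int).
    // The write is blocking (true), so no extra queue.finish() is needed.
    queue.enqueueWriteBuffer(d_vec, true, i*(N/4)*sizeof(cl_int), sizeof(cl_int)*N/4, (void*)vals);
}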
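The loop above assumes N is divisible by 4. If that is not guaranteed, the byte offset and size of each write can be computed up front. The following is only a sketch under that assumption: ChunkRange and plan_chunks are hypothetical helpers, not part of OpenCL, and the staging array on the host would then need room for the largest chunk.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one chunked write: where it starts in the
// buffer and how many bytes it covers.
struct ChunkRange {
    std::size_t offset_bytes;
    std::size_t size_bytes;
};

// Splits n_elems elements of elem_size bytes into n_chunks contiguous ranges.
// When n_elems is not divisible by n_chunks, the first (n_elems % n_chunks)
// chunks each receive one extra element.
std::vector<ChunkRange> plan_chunks(std::size_t n_elems, std::size_t elem_size,
                                    std::size_t n_chunks) {
    assert(n_chunks > 0);
    const std::size_t base = n_elems / n_chunks;
    const std::size_t extra = n_elems % n_chunks;
    std::vector<ChunkRange> ranges;
    ranges.reserve(n_chunks);
    std::size_t start = 0;  // first element index of the current chunk
    for (std::size_t i = 0; i < n_chunks; ++i) {
        const std::size_t count = base + (i < extra ? 1 : 0);
        ranges.push_back({start * elem_size, count * elem_size});
        start += count;
    }
    return ranges;
}
```

Each range's offset_bytes and size_bytes could then be passed as the offset and size arguments of enqueueWriteBuffer, one call per range.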
Upvotes: 2