Reputation: 21
I'm developing a program that implements a recursive ray tracer in OpenCL. To run the kernel I have two device options: the Intel device that is integrated with the system, and an Nvidia GeForce graphics card.
When I run the project with the first device there's no problem; it runs correctly and shows the result of the algorithm just fine.
But when I run it on the Nvidia device, it crashes in the callback function that performs the synchronous buffer maps.
The part of the code where it crashes is the following:
clEnqueueNDRangeKernel( queue, kernel, 1, NULL, &global_work_size, NULL, 0, NULL, NULL);
// 7. Look at the results via synchronous buffer map.
cl_float4 *ptr = (cl_float4 *) clEnqueueMapBuffer( queue, buffer, CL_TRUE, CL_MAP_READ, 0, kWidth * kHeight * sizeof(cl_float4), 0, NULL, NULL, NULL );
cl_float *viewTransformPtr = (cl_float *) clEnqueueMapBuffer( queue, viewTransform, CL_TRUE, CL_MAP_WRITE, 0, 16 * sizeof(cl_float), 0, NULL, NULL, NULL );
cl_float *worldTransformsPtr = (cl_float *) clEnqueueMapBuffer( queue, worldTransforms, CL_TRUE, CL_MAP_WRITE, 0, 16 * sizeof(cl_float), 0, NULL, NULL, NULL );
memcpy(viewTransformPtr, viewMatrix, sizeof(float)*16);
memcpy(worldTransformsPtr, sphereTransforms, sizeof(float)*16);
clEnqueueUnmapMemObject(queue, viewTransform, viewTransformPtr, 0, 0, 0);
clEnqueueUnmapMemObject(queue, worldTransforms, worldTransformsPtr, 0, 0, 0);
unsigned char* pixels = new unsigned char[kWidth*kHeight*4];
for (int i = 0; i < kWidth * kHeight; i++) {
    pixels[i*4]   = ptr[i].s[0]*255;
    pixels[i*4+1] = ptr[i].s[1]*255;
    pixels[i*4+2] = ptr[i].s[2]*255;
    pixels[i*4+3] = 1;
}
glBindTexture(GL_TEXTURE_2D, 1);
glTexParameterf( GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR );
glTexParameterf( GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR );
glTexImage2D(GL_TEXTURE_2D, 0, 4, kWidth, kHeight, 0, GL_RGBA, GL_UNSIGNED_BYTE, pixels);
delete [] pixels;
The last two calls to clEnqueueMapBuffer fail with error -5, which corresponds to CL_OUT_OF_RESOURCES, but I believe the buffer sizes are correct.
Upvotes: 1
Views: 863
Reputation: 8410
According to the OpenCL spec, the behavior of blocking OpenCL calls made from a callback is undefined. Your code is likely correct, but it can't be used from a callback. On the Intel platform, where the device shares memory with the host, the maps are effectively no-ops, which is why they don't fail there. The relevant passage from the spec:
The behavior of calling expensive system routines, OpenCL API calls to create contexts or command-queues, or blocking OpenCL operations from the following list below, in a callback is undefined.
- clFinish
- clWaitForEvents
- blocking calls to clEnqueueReadBuffer, clEnqueueReadBufferRect, clEnqueueWriteBuffer, and clEnqueueWriteBufferRect
- blocking calls to clEnqueueReadImage and clEnqueueWriteImage
- blocking calls to clEnqueueMapBuffer and clEnqueueMapImage
- blocking calls to clBuildProgram
If an application needs to wait for completion of a routine from the above list in a callback, please use the non-blocking form of the function, and assign a completion callback to it to do the remainder of your work. Note that when a callback (or other code) enqueues commands to a command-queue, the commands are not required to begin execution until the queue is flushed. In standard usage, blocking enqueue calls serve this role by implicitly flushing the queue. Since blocking calls are not permitted in callbacks, those callbacks that enqueue commands on a command queue should either call clFlush on the queue before returning or arrange for clFlush to be called later on another thread.
Upvotes: 0