Reputation: 1573
I am currently writing an OpenCL application that performs memory-intensive calculations. To track the progress of the calculations, I created a for loop that launches the kernels in separate groups. Unfortunately, the calculation fills up all of my memory. My guess is that the kernels have not finished executing before the next batch is added.
for (unsigned long i=1; i<maxGlobalThreads; i+=1000000) {
    // Calculating Offset
    size_t temp = 1000000;
    size_t offset = 0;
    if (i>1000000) {
        offset = i-1000000;
    }
    cl_event tmp;
    clEnqueueNDRangeKernel(command_queue, kernel, 1, &offset, &temp, NULL, NULL, 0, &tmp);
    // Wait until Threads finished (-- not working)
    clWaitForEvents(1, &tmp);
    // Copy results from memory buffer
    int *res = (int*)malloc(64*sizeof(int));
    int *indexNum = (int*)malloc(14*sizeof(int));
    err = clEnqueueReadBuffer(command_queue, foundCombiMem, CL_TRUE, 0, 64*sizeof(int), res, 0, NULL, NULL);
    err = clEnqueueReadBuffer(command_queue, indexNumMem, CL_TRUE, 0, 14*sizeof(int), indexNum, 0, NULL, NULL);
    // Calculate Time for 1000000 checked combinations
    diff = clock() - start;
    double msec = diff * 1000 / CLOCKS_PER_SEC;
    printf("%f\n", (msec/(i*1000000))*1000000);
    [ ... ]
}
Upvotes: 0
Views: 224
Reputation: 5381
In this case the best you can do is to add
clFinish(CommandQueue);
after
clEnqueueNDRangeKernel
Upvotes: 0
Reputation: 2318
You are doing two mallocs on each iteration of the loop that are never freed. This is why you are running out of memory.
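One way to avoid the leak is to allocate the two host buffers once before the loop, reuse them for every batch, and free them after the loop. The sketch below is only an illustration without any OpenCL calls: read_results_stub and process_batches are hypothetical names, and the stub merely stands in for the two blocking clEnqueueReadBuffer calls from the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the two blocking clEnqueueReadBuffer calls:
 * fills both host buffers with the batch number. */
static void read_results_stub(int *res, size_t res_len,
                              int *index_num, size_t index_len, int batch)
{
    for (size_t k = 0; k < res_len; ++k) res[k] = batch;
    for (size_t k = 0; k < index_len; ++k) index_num[k] = batch;
}

/* Runs `batches` iterations while reusing a single pair of host buffers.
 * Returns the sum of res[0] + index_num[0] over all batches,
 * or -1 if an allocation fails. */
static long process_batches(int batches)
{
    assert(batches >= 0);

    /* Allocate once, outside the loop. */
    int *res = malloc(64 * sizeof *res);
    int *index_num = malloc(14 * sizeof *index_num);
    if (res == NULL || index_num == NULL) {
        free(res);        /* free(NULL) is a no-op */
        free(index_num);
        return -1;
    }

    long checksum = 0;
    for (int b = 0; b < batches; ++b) {
        /* In the real program: enqueue the kernel, wait, then read back. */
        read_results_stub(res, 64, index_num, 14, b);
        checksum += res[0] + index_num[0];
    }

    /* Release once, after the loop. */
    free(res);
    free(index_num);
    return checksum;
}
```

Since the buffers are small and have a fixed size, plain local arrays (int res[64]; int indexNum[14];) would also work and need no free at all.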
Also, your loop is using an unsigned long variable, which could be a problem (for example, if the counter wraps around) depending on the value of maxGlobalThreads.
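To illustrate the loop-counter point, here is a minimal sketch of a batching loop that tracks the offset in a size_t and clamps the last batch, so the counter never steps past the total. batch_size and count_batches are hypothetical helper names, and the comment marks where the kernel would be enqueued in the real program.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: size of the batch starting at `offset`,
 * clamped to the work that is left. Requires offset < total. */
static size_t batch_size(size_t offset, size_t total, size_t step)
{
    size_t remaining = total - offset;
    return remaining < step ? remaining : step;
}

/* Walks over `total` work items in batches of at most `step`
 * and returns how many batches were needed. */
static size_t count_batches(size_t total, size_t step)
{
    assert(step > 0);

    size_t batches = 0;
    size_t offset = 0;
    while (offset < total) {
        size_t n = batch_size(offset, total, step);
        /* Real program: enqueue the kernel here with global offset
         * `offset` and global size `n`, then wait and read back. */
        offset += n;  /* n <= total - offset, so this cannot wrap */
        ++batches;
    }
    return batches;
}
```

Because each step is clamped to the remaining work, the loop also terminates when the total is close to the largest value a size_t can hold, where a plain i += 1000000 could wrap around and never reach the end condition.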
Upvotes: 2