I.B.

Reputation: 30

OpenCL global_id starts counting from 1 instead of 0 (Xeon Phi)

I am a novice at OpenCL and recently I have stumbled onto something which does not make sense to me.

I am using the Intel drivers on a Linux machine, and the device is a Xeon Phi coprocessor.

The problem is that when I give local_item_size as an argument to

clEnqueueNDRangeKernel(commandQueue,
                       forceKernel, 1, 
                       &localItemSize, &globalItemSize,
                       NULL, 0, NULL, &kernelDone); 

and then print the global thread ID in the kernel with

int tid = get_global_id(0); 

the thread IDs start from 1 rather than 0.

When I do not specify local_item_size and pass NULL as that argument instead, the IDs start counting correctly from 0.

At the moment I work around this by subtracting 1 from the return value of get_global_id(0) so that my code works correctly.

In short: when I pass my local_item_size, the tid starts from 1. When I pass NULL, it starts from 0.

Size setting code:

// Global item size 
if (n <= NUM_THREADS) { 
  globalItemSize = NUM_THREADS; 
  localItemSize = 16; 
} else if (n % NUM_THREADS != 0) { 
  globalItemSize = (n / NUM_THREADS + 1) * NUM_THREADS; 
} else { 
  globalItemSize = n; 
} 

// Local item size 
localItemSize = globalItemSize / NUM_THREADS;

Upvotes: 0

Views: 284

Answers (2)

Oak

Reputation: 26868

The 4th parameter to clEnqueueNDRangeKernel is the array of global work offsets, not the local size. The local size is the 6th parameter. Your call should be

clEnqueueNDRangeKernel(commandQueue,
                       forceKernel, 1, 
                       NULL, &globalItemSize,
                       &localItemSize, 0, NULL, &kernelDone);

This is also why the IDs started at 1: the value of localItemSize, which was evidently 1, was being used as the global offset!
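To make the effect of the offset concrete, here is a minimal sketch in plain C that does not call OpenCL at all. It only models the rule that the runtime adds the global work offset to every global ID in a 1-D launch. The helper names simulated_global_id and print_global_ids are made up for this illustration and are not part of the OpenCL API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not an OpenCL function): models the value that
 * get_global_id(0) would return for the i-th work-item of a 1-D launch.
 * The global work offset is added to every ID. */
static size_t simulated_global_id(size_t global_offset, size_t i)
{
    return global_offset + i;
}

/* Prints the IDs a 1-D launch of global_size work-items would see
 * and returns the first (smallest) one. */
static size_t print_global_ids(size_t global_offset, size_t global_size)
{
    assert(global_size > 0);
    for (size_t i = 0; i < global_size; ++i)
        printf("%zu ", simulated_global_id(global_offset, i));
    printf("\n");
    return simulated_global_id(global_offset, 0);
}
```

With an offset of 1 and four work-items, print_global_ids(1, 4) lists the IDs 1 2 3 4. With an offset of 0 (which is what passing NULL for the offset means), the same launch lists 0 1 2 3.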

Upvotes: 3

jprice

Reputation: 9925

You are passing your work-group size to the wrong argument. The fourth argument of clEnqueueNDRangeKernel is the global work offset, which is why your global IDs are appearing offset. The work-group size should go to the sixth argument:

clEnqueueNDRangeKernel(commandQueue,
                       forceKernel, 1, NULL,
                       &globalItemSize, &localItemSize,
                       0, NULL, &kernelDone); 
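For contrast with the offset, the following sketch in plain C (no OpenCL calls) models what the work-group size does once it is passed in the right position: it only partitions the global range into groups and does not shift the IDs. The WorkItemPos struct and both helper functions are hypothetical names invented for this illustration, and the sketch assumes a 1-D launch with a zero global offset.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct (not an OpenCL type): position of one work-item. */
typedef struct {
    size_t group_id;  /* what get_group_id(0) would return */
    size_t local_id;  /* what get_local_id(0) would return */
} WorkItemPos;

/* Number of work-groups in a 1-D launch. In OpenCL 1.x an explicit
 * work-group size must divide the global size evenly. */
static size_t num_work_groups(size_t global_size, size_t local_size)
{
    assert(local_size > 0 && global_size % local_size == 0);
    return global_size / local_size;
}

/* Splits a global ID (zero global offset assumed) into its
 * work-group index and its index within that group. */
static WorkItemPos locate_work_item(size_t global_id, size_t local_size)
{
    assert(local_size > 0);
    WorkItemPos pos = { global_id / local_size, global_id % local_size };
    return pos;
}
```

For example, with a global size of 64 and a work-group size of 16 there are 4 groups, and the work-item with global ID 35 is item 3 of group 2. The global IDs themselves still run from 0 to 63.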

Upvotes: 3
