Reputation: 53
unsigned int tid = threadIdx.x + threadIdx.y * blockDim.x;
unsigned int warpid = tid / warpSize;
This works for blocks with more than one dimension, right?
int i = blockIdx.x * blockDim.x + threadIdx.x;
int j = blockIdx.y * blockDim.y + threadIdx.y;
int k = blockIdx.z * blockDim.z + threadIdx.z;
int tid = i + j + k;
int l = tid / warpSize;
Is that right? I know this doesn't make sense...
Upvotes: 1
Views: 1356
Reputation: 1507
To compute a thread ID that is unique within the block and follows the order in which threads are laid out on the GPU:
int tid = threadIdx.x +
threadIdx.y * blockDim.x +
threadIdx.z * blockDim.x * blockDim.y;
To compute warp id:
int warpid = tid / warpSize; // warpSize is 32 on current NVIDIA GPUs
With this, threads that share the same warpid belong to the same warp.
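To make the block-unique computation concrete, here is a minimal sketch of a complete program that records the warp index and lane for every thread of a single 3D block. The helper name linearTid, the kernel name recordWarpIds, and the 8x4x2 block shape are my own choices for illustration, not anything from the question.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Linear index of a thread inside its block: x varies fastest, then y, then z.
// (linearTid is a hypothetical helper name, not part of the CUDA API.)
__host__ __device__ inline unsigned int linearTid(uint3 t, dim3 b)
{
    return t.x + t.y * b.x + t.z * b.x * b.y;
}

// Each thread stores its block-unique warp index and its lane within that warp.
__global__ void recordWarpIds(unsigned int *warpIds, unsigned int *laneIds)
{
    unsigned int tid = linearTid(threadIdx, blockDim);
    warpIds[tid] = tid / warpSize;
    laneIds[tid] = tid % warpSize;
}

int main()
{
    const dim3 block(8, 4, 2);  // assumed example shape: 64 threads in a 3D block
    const unsigned int n = block.x * block.y * block.z;
    const size_t bytes = n * sizeof(unsigned int);

    unsigned int *dWarp = nullptr, *dLane = nullptr;
    if (cudaMalloc(&dWarp, bytes) != cudaSuccess ||
        cudaMalloc(&dLane, bytes) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    recordWarpIds<<<1, block>>>(dWarp, dLane);

    // A blocking cudaMemcpy waits for the kernel to finish before copying.
    std::vector<unsigned int> warp(n), lane(n);
    cudaError_t err = cudaMemcpy(warp.data(), dWarp, bytes, cudaMemcpyDeviceToHost);
    if (err == cudaSuccess)
        err = cudaMemcpy(lane.data(), dLane, bytes, cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Print a sample of threads to see where the warp index changes.
    for (unsigned int tid = 0; tid < n; tid += 16)
        printf("tid %2u -> warp %u, lane %2u\n", tid, warp[tid], lane[tid]);

    cudaFree(dWarp);
    cudaFree(dLane);
    return 0;
}
```

Dividing by the built-in warpSize instead of a literal keeps the kernel independent of the warp width, and the remainder gives the lane, which is often needed alongside the warp index.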
EDIT:
As stated above, this tid is unique only within a block, so you cannot use it as a unique index into the arrays A, B, C. Similarly, the variable warpid holds only the block-unique index of the warp.
Your approach works for one-dimensional blocks and grids because the expression int i = blockIdx.x * blockDim.x + threadIdx.x; computes a grid-unique thread ID, while the variables j and k are both zero.
So how you compute an identifier depends on its purpose as well as on the dimensions of the blocks and the grid. For example, to compute global thread identifiers for accessing global memory arrays with a 1D grid and 3D blocks:
int tid = threadIdx.x +
threadIdx.y * blockDim.x +
threadIdx.z * blockDim.x * blockDim.y +
blockIdx.x * blockDim.x * blockDim.y * blockDim.z; // number of threads in prior blocks
The calculation of the warp identifier likewise depends on the purpose, that is, on whether you need a grid-unique or a block-unique ID.
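As a rough sketch of the grid-unique case with a 1D grid and 3D blocks, the program below computes both a global thread ID and a global warp ID. Note that warps never span blocks, so when the block size is not a multiple of warpSize, dividing the global thread ID by warpSize would lump threads from different warps together; the sketch instead counts warps per block (rounding up) and offsets by the block index. The names linearTid, globalTid and recordGlobalWarpIds, and the launch shape, are assumptions for illustration only.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Block-unique linear thread index (hypothetical helper, not a CUDA API call).
__host__ __device__ inline unsigned int linearTid(uint3 t, dim3 b)
{
    return t.x + t.y * b.x + t.z * b.x * b.y;
}

// Grid-unique thread index for a 1D grid of 3D blocks (hypothetical helper).
__host__ __device__ inline unsigned int globalTid(uint3 t, dim3 b, unsigned int blockX)
{
    // blockX * threads-per-block = number of threads in prior blocks
    return linearTid(t, b) + blockX * (b.x * b.y * b.z);
}

// Each thread writes its grid-unique warp index at its grid-unique thread index.
__global__ void recordGlobalWarpIds(unsigned int *out, unsigned int n)
{
    unsigned int local = linearTid(threadIdx, blockDim);
    unsigned int perBlock = blockDim.x * blockDim.y * blockDim.z;
    unsigned int gtid = globalTid(threadIdx, blockDim, blockIdx.x);
    // Round up: the last warp of a block may be only partially filled.
    unsigned int warpsPerBlock = (perBlock + warpSize - 1) / warpSize;
    if (gtid < n)
        out[gtid] = blockIdx.x * warpsPerBlock + local / warpSize;
}

int main()
{
    const dim3 block(6, 4, 2);     // assumed shape: 48 threads, not a multiple of 32
    const unsigned int blocks = 3; // assumed 1D grid size
    const unsigned int n = blocks * block.x * block.y * block.z;
    const size_t bytes = n * sizeof(unsigned int);

    unsigned int *dOut = nullptr;
    if (cudaMalloc(&dOut, bytes) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    recordGlobalWarpIds<<<blocks, block>>>(dOut, n);

    // A blocking cudaMemcpy waits for the kernel to finish before copying.
    std::vector<unsigned int> warpOf(n);
    cudaError_t err = cudaMemcpy(warpOf.data(), dOut, bytes, cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Print a sample of global thread IDs with their global warp index.
    for (unsigned int gtid = 0; gtid < n; gtid += 16)
        printf("global tid %3u -> global warp %u\n", gtid, warpOf[gtid]);

    cudaFree(dOut);
    return 0;
}
```

If every block size you launch is a multiple of warpSize, the simpler global tid / warpSize gives the same result; the rounded-up form just stays correct for the other cases too.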
Upvotes: 2