How would you calculate warpid across all possible block dimensions?

Question

    unsigned int tid = threadIdx.x + threadIdx.y * blockDim.x;
    unsigned int warpid = tid / warpSize;

Is this correct for blocks with more than one dimension? Or should it be something like this:

    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int j = blockIdx.y * blockDim.y + threadIdx.y;
    int k = blockIdx.z * blockDim.z + threadIdx.z;

    int tid = i + j + k;

    int l = tid / warpSize;

Is that right? I know this doesn't make sense...

stuhlo · Accepted Answer

To compute a thread id that is unique within its block and matches how threads are laid out on the GPU (x varies fastest, then y, then z):

    int tid = threadIdx.x + threadIdx.y * blockDim.x + threadIdx.z * blockDim.x * blockDim.y;

To compute the warp id:

    int warpid = tid / warpSize; // warpSize is 32 on current NVIDIA GPUs

Threads of the same block that share a warpid belong to the same warp.
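
As a quick sanity check of the arithmetic, here is a minimal host-side C++ sketch (not a kernel). The names Dim3, linearThreadId and warpId are hypothetical and introduced only for illustration. Dim3 stands in for the built-in threadIdx and blockDim so the snippet compiles without nvcc, and the default warp size of 32 is an assumption that holds on current NVIDIA GPUs.

```cpp
// Hypothetical stand-in for CUDA's built-in threadIdx / blockDim types.
struct Dim3 { unsigned x, y, z; };

// Block-local linear thread id: x varies fastest, then y, then z.
constexpr unsigned linearThreadId(Dim3 threadIdx, Dim3 blockDim) {
    return threadIdx.x
         + threadIdx.y * blockDim.x
         + threadIdx.z * blockDim.x * blockDim.y;
}

// Block-local warp id. In a kernel you would divide by the built-in warpSize;
// the default of 32 here is an assumption for host-side testing.
constexpr unsigned warpId(Dim3 threadIdx, Dim3 blockDim, unsigned warpSize = 32) {
    return linearThreadId(threadIdx, blockDim) / warpSize;
}
```

For an 8 x 4 x 2 block (64 threads), thread (7, 3, 0) gets tid 31 and lands in warp 0, while thread (0, 0, 1) gets tid 32 and lands in warp 1.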

EDIT:

As stated above, this tid is unique only within a block, so on its own it cannot be used to index unique elements of the arrays A, B, C. Similarly, warpid holds only a block-unique warp index.

Your approach works for one-dimensional blocks and a one-dimensional grid because the expression int i = blockIdx.x * blockDim.x + threadIdx.x; already computes a grid-unique thread id, and the variables j and k are both zero.
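
To see why summing i, j and k stops working beyond the 1D case, here is a small host-side C++ sketch of the question's formula. Dim3 and summedId are hypothetical names used only for this illustration.

```cpp
// Hypothetical stand-in for CUDA's built-in blockIdx / blockDim / threadIdx.
struct Dim3 { unsigned x, y, z; };

// The question's formula: add up the per-axis global coordinates i + j + k.
constexpr unsigned summedId(Dim3 blockIdx, Dim3 blockDim, Dim3 threadIdx) {
    return (blockIdx.x * blockDim.x + threadIdx.x)   // i
         + (blockIdx.y * blockDim.y + threadIdx.y)   // j
         + (blockIdx.z * blockDim.z + threadIdx.z);  // k
}
```

In a single 4 x 4 x 1 block, threads (1, 0, 0) and (0, 1, 0) both get the value 1, so the sum is not a unique id. With a 1D grid of 1D blocks, j and k vanish and the result is just i.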

So how you compute identifiers depends on their purpose as well as on the dimensions of the blocks and the grid. For example, to compute global thread identifiers for accessing global memory arrays with a 1D grid and 3D blocks:

    int tid = threadIdx.x
            + threadIdx.y * blockDim.x
            + threadIdx.z * blockDim.x * blockDim.y
            + blockIdx.x * blockDim.x * blockDim.y * blockDim.z; // threads in all prior blocks

The calculation of the warp identifier likewise depends on whether you need a global or a block-unique id.
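
A possible way to derive both global identifiers for the 1D grid, 3D block case is sketched below in host-side C++. All function names and the Dim3 struct are hypothetical, and the default warp size of 32 is an assumption. The global warp id is built from the number of warps per block rather than from the global tid divided by the warp size, because warps never span blocks and the two only agree when the block size is a multiple of the warp size.

```cpp
// Hypothetical stand-in for CUDA's built-in threadIdx / blockDim types.
struct Dim3 { unsigned x, y, z; };

constexpr unsigned threadsPerBlock(Dim3 blockDim) {
    return blockDim.x * blockDim.y * blockDim.z;
}

// Warps per block, rounded up: a partially filled last warp still counts.
constexpr unsigned warpsPerBlock(Dim3 blockDim, unsigned warpSize) {
    return (threadsPerBlock(blockDim) + warpSize - 1) / warpSize;
}

// Block-local linear thread id: x varies fastest, then y, then z.
constexpr unsigned localThreadId(Dim3 threadIdx, Dim3 blockDim) {
    return threadIdx.x
         + threadIdx.y * blockDim.x
         + threadIdx.z * blockDim.x * blockDim.y;
}

// Grid-unique thread id for a 1D grid (only blockIdx.x is used).
constexpr unsigned globalThreadId(unsigned blockIdxX, Dim3 threadIdx, Dim3 blockDim) {
    return localThreadId(threadIdx, blockDim)
         + blockIdxX * threadsPerBlock(blockDim);  // threads in all prior blocks
}

// Grid-unique warp id for a 1D grid: warps in prior blocks plus local warp id.
constexpr unsigned globalWarpId(unsigned blockIdxX, Dim3 threadIdx, Dim3 blockDim,
                                unsigned warpSize = 32) {
    return blockIdxX * warpsPerBlock(blockDim, warpSize)
         + localThreadId(threadIdx, blockDim) / warpSize;
}
```

With 8 x 4 x 2 blocks, thread (0, 0, 1) of block 3 gets global tid 224 and global warp id 7. With 5 x 2 x 1 blocks (10 threads, one partial warp each), thread (0, 0, 0) of block 1 has global tid 10, so dividing by 32 would give 0, yet it actually sits in the second warp of the grid, which this sketch numbers 1.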
