Reputation: 393
I wrote a CUDA kernel that uses shared memory like this:
__global__ void matrix_mul_shared(float *ad,float *bd,float *cd,int N)
{
float pvalue=0;
int TILE=blockDim.x;
int ty=threadIdx.y;
int tx=threadIdx.x;
//allocate shared memory per block
__shared__ float ads[1][1];
__shared__ float bds[1][1];
.
.
.
}
This code works, but the following code fails:
__global__ void matrix_mul_shared(float *ad,float *bd,float *cd,int N)
{
float pvalue=0;
int TILE=blockDim.x;
int ty=threadIdx.y;
int tx=threadIdx.x;
//allocate shared memory per block
__shared__ float ads[TILE][TILE];
__shared__ float bds[TILE][TILE];
.
.
.
}
The compiler expects a constant on the lines where I allocate shared memory. I forgot the exact error, but it is something like this:
The parameters should be a constant
I was able to use printf to print the value of TILE, and it comes out as 1. So why do I get this error?
Upvotes: 0
Views: 227
Reputation: 21475
I think the error you are receiving is
error: expression must have a constant value
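The variable TILE is not a constant in the sense meant by the compiler. The dimensions of a statically declared __shared__ array must be known at compile time, but TILE is initialized from blockDim.x, which is only known at run time, when the kernel is launched. Printing TILE as 1 at run time does not change this, because the array size has to be fixed when the code is compiled.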
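A possible solution is to make TILE a compile-time constant (the kernel must then be launched with a matching block size):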
#define TILE 16
__global__ void matrix_mul_shared(float *ad,float *bd,float *cd,int N)
{
...
__shared__ float ads[TILE][TILE];
...
}
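If the tile size really has to follow blockDim.x at run time, another option is dynamically sized shared memory: declare one extern __shared__ array and pass its size in bytes as the third launch parameter. The following is only a minimal sketch of that idea, not the original kernel. The kernel name matrix_mul_dyn, the indexing of the elided loop body, the test sizes, and the assumptions that the block is square and that N is a multiple of the block width are all mine.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Tiled C = A * B for N x N row-major matrices (hypothetical reconstruction).
// Assumes blockDim.x == blockDim.y and N divisible by blockDim.x.
__global__ void matrix_mul_dyn(const float *ad, const float *bd, float *cd, int N)
{
    // The size of this array is set at launch time, not at compile time.
    extern __shared__ float smem[];
    int TILE = blockDim.x;
    float *ads = smem;                // first TILE*TILE floats
    float *bds = smem + TILE * TILE;  // next TILE*TILE floats

    int tx = threadIdx.x, ty = threadIdx.y;
    int row = blockIdx.y * TILE + ty;
    int col = blockIdx.x * TILE + tx;
    float pvalue = 0.0f;

    for (int m = 0; m < N / TILE; ++m) {
        // Each thread loads one element of the current A tile and B tile.
        ads[ty * TILE + tx] = ad[row * N + (m * TILE + tx)];
        bds[ty * TILE + tx] = bd[(m * TILE + ty) * N + col];
        __syncthreads();
        for (int k = 0; k < TILE; ++k)
            pvalue += ads[ty * TILE + k] * bds[k * TILE + tx];
        __syncthreads();  // finish using the tiles before overwriting them
    }
    cd[row * N + col] = pvalue;
}

int main()
{
    const int N = 64, tile = 16;  // illustrative sizes only
    size_t bytes = (size_t)N * N * sizeof(float);
    float *a, *b, *c;
    cudaMallocManaged(&a, bytes);
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&c, bytes);
    for (int i = 0; i < N * N; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    dim3 block(tile, tile), grid(N / tile, N / tile);
    size_t shmem = 2 * (size_t)tile * tile * sizeof(float);  // ads + bds
    matrix_mul_dyn<<<grid, block, shmem>>>(a, b, c, N);

    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    // With A all ones and B all twos, every element of C should be 2 * N.
    printf("c[0] = %f (expected %f)\n", c[0], 2.0f * N);

    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```

The trade-off is that dynamic shared memory is a single flat array, so the two tiles have to be carved out of it and indexed by hand instead of with [ty][tx]. With a #define (or a template parameter) the compiler knows the tile size and can keep the 2D syntax and unroll the inner loop.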
Upvotes: 1