username_4567

Reputation: 4903

2D arrays in CUDA

I have read a lot about handling 2D arrays in CUDA, and I think it is necessary to flatten the array before sending it to the GPU. However, can I allocate a 1D array on the GPU and access it as a 2D array inside the kernel? I tried but failed. My code looks as follows:

__global__ void kernel( int **d_a )
{ 

   cuPrintf("%p",d_a[0][0]);
}

int main(){

    int **A;

    int i;

    cudaPrintfInit();

    cudaMalloc((void**)&A,16*sizeof(int));

    kernel<<<1,1>>>(A);

    cudaPrintfDisplay(stdout,true);

    cudaPrintfEnd();
}

Upvotes: 3

Views: 2659

Answers (2)

username_4567

Reputation: 4903

This is how I fixed the problem: I call cudaMalloc in the usual way, but when passing the pointer to the kernel I cast it to int (*)[col]. This is working for me.
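A minimal sketch of that cast, in case it helps someone else. It assumes the column count is a compile-time constant (COLS here, since C++ has no variable-length array types), and the names fill, run_fill, ROWS and COLS are made up for illustration, not taken from my original code:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define ROWS 4   // hypothetical sizes, chosen for illustration
#define COLS 4

// d_a points to rows of COLS ints, so d_a[r][c] resolves to the same
// address as flat[r * COLS + c] in the underlying 1D allocation.
__global__ void fill(int (*d_a)[COLS])
{
    int r = threadIdx.y;
    int c = threadIdx.x;
    d_a[r][c] = r * COLS + c;
}

// Allocates a flat buffer, fills it through the 2D view, copies it back.
// Returns 0 on success, 1 on any CUDA error.
int run_fill(int *host_out)
{
    int *d_flat = NULL;
    if (cudaMalloc((void **)&d_flat, ROWS * COLS * sizeof(int)) != cudaSuccess)
        return 1;

    dim3 block(COLS, ROWS);                     // one thread per element
    fill<<<1, block>>>((int (*)[COLS])d_flat);  // the cast described above

    cudaError_t err = cudaMemcpy(host_out, d_flat, ROWS * COLS * sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_flat);
    return err == cudaSuccess ? 0 : 1;
}

int main()
{
    int h[ROWS * COLS];
    if (run_fill(h) != 0) {
        fprintf(stderr, "CUDA error\n");
        return 1;
    }
    for (int r = 0; r < ROWS; ++r) {
        for (int c = 0; c < COLS; ++c)
            printf("%3d", h[r * COLS + c]);
        printf("\n");
    }
    return 0;
}
```

The allocation is still a single flat block. Only the pointer type changes, so the compiler does the row*COLS + col arithmetic for you.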

Upvotes: 0

limes

Reputation: 638

In fact, it is not necessary to "flatten" your 2D array before using it on the GPU (although this can speed up memory accesses). If you'd like a 2D array, you can use something like cudaMallocPitch, which is documented in the CUDA C Programming Guide. I believe the reason your code isn't working is that you only allocated a 1D array, so A[0][0] doesn't exist. If you look at your code, you made a 1D array of ints, not of int*s, so indexing it twice in the kernel treats uninitialized int data as a pointer. If you wanted to malloc a flattened 2D array, you could do something like:

int *A;
cudaMalloc((void**)&A, length*length*sizeof(int)); //where length is the number of rows/cols you want

And then in your kernel use (to print the address of any element):

__global__ void kernel( int *d_a, int row, int col, int stride )
{ 
  printf("%p\n", (void*)&d_a[ col + row*stride ]);
}
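For the cudaMallocPitch route mentioned above, here is a rough sketch rather than a definitive implementation. The names fill_pitched and run_pitched, the 16x16 block shape and the 5x7 array size are assumptions picked for illustration. The point is that rows are pitch bytes apart, so the row offset has to be computed in bytes:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each row starts pitch bytes after the previous one, so step through the
// buffer as bytes first and only then treat the row as an int array.
__global__ void fill_pitched(int *d_a, size_t pitch, int rows, int cols)
{
    int r = blockIdx.y * blockDim.y + threadIdx.y;
    int c = blockIdx.x * blockDim.x + threadIdx.x;
    if (r < rows && c < cols) {
        int *row = (int *)((char *)d_a + r * pitch);
        row[c] = r * cols + c;
    }
}

// Allocates a pitched rows x cols array, fills it on the device and copies
// it into a dense host array. Returns 0 on success, 1 on any CUDA error.
int run_pitched(int *host_out, int rows, int cols)
{
    int *d_a = NULL;
    size_t pitch = 0;
    if (cudaMallocPitch((void **)&d_a, &pitch, cols * sizeof(int), rows) != cudaSuccess)
        return 1;

    dim3 block(16, 16);
    dim3 grid((cols + block.x - 1) / block.x, (rows + block.y - 1) / block.y);
    fill_pitched<<<grid, block>>>(d_a, pitch, rows, cols);

    // cudaMemcpy2D drops the row padding while copying back to the host.
    cudaError_t err = cudaMemcpy2D(host_out, cols * sizeof(int), d_a, pitch,
                                   cols * sizeof(int), rows, cudaMemcpyDeviceToHost);
    cudaFree(d_a);
    return err == cudaSuccess ? 0 : 1;
}

int main()
{
    const int rows = 5, cols = 7;   // hypothetical sizes
    int h[rows * cols];
    if (run_pitched(h, rows, cols) != 0) {
        fprintf(stderr, "CUDA error\n");
        return 1;
    }
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c)
            printf("%3d", h[r * cols + c]);
        printf("\n");
    }
    return 0;
}
```

The pitch returned by cudaMallocPitch may be larger than cols*sizeof(int), which is why the host copy goes through cudaMemcpy2D instead of a plain cudaMemcpy.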

Upvotes: 2
