anupshrestha

Reputation: 236

Setting an array in device memory with pointers to structs in CUDA

I am trying to initialize an array in device memory with pointers to structs that I create inside a kernel. Here is the code I have so far; I don't know what I am doing wrong. If I call cudaMalloc on each item in the array, I get a segmentation fault; if I don't, I get an "unspecified launch failure" error.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int status;
    int location;
    double distance;
} Point;

//Macro for checking cuda errors following a cuda launch or api call
#define cudaCheckError() {\
 cudaError_t e=cudaGetLastError();\
 if(e!=cudaSuccess) {\
   printf("\nCuda failure %s:%d: '%s'\n",__FILE__,__LINE__,cudaGetErrorString(e));\
   exit(0); \
 }\
}

__global__ void kernel1(Point** d_memory, int limit){

  int idx = blockIdx.x * blockDim.x * blockDim.y * blockDim.z 
  + threadIdx.z * blockDim.y * blockDim.x 
  + threadIdx.y * blockDim.x + threadIdx.x;

    if(idx < limit) {

        Point* pt = ( Point *) malloc( sizeof(Point) );
        pt->distance = 10;
        pt->location = -1;
        pt->status = -1;

        d_memory[idx] = pt;
    }
}

__global__ void kernel2(Point** d_memory, int limit){
  int i;
  for (i=0; i<limit;i++){
    printf("%f \n",d_memory[i]->distance);
  }
}

int main(int argc, char *argv[])
{
    int totalGrid = 257*193*129;
    size_t size = sizeof(Point) * totalGrid;
    Point ** d_memory;
    cudaMalloc((void **)&d_memory, size);
    /*
    for(int i=0; i<totalGrid; i++){
        printf("%d\n",i);
        cudaMalloc((void **)&d_memory[i], sizeof(Point));
    }*/
    dim3 bs(16,8,8);
    kernel1<<<6249, bs>>>(d_memory, totalGrid);
    cudaCheckError();

    cudaDeviceSynchronize();

    kernel2<<<1,1>>>(d_memory, totalGrid);
    cudaCheckError();

    cudaFree(d_memory);
    return 0;
}

This is the command I used to compile the code:

 nvcc -arch=sm_20 test.cu

Upvotes: 4

Views: 542

Answers (1)

Iharob Al Asimi

Reputation: 53016

I believe your problem is

Point **d_memory;

it should be

Point *d_memory;

You should then not need the cast to void **. Your code needs it only because, with Point **d_memory, the expression &d_memory is a Point *** and not a Point **.
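To make the single-pointer suggestion concrete, here is a minimal sketch of the flat layout, assuming one contiguous array of Point that the kernel fills in place. The names initPoints and d_points are illustrative and not from the original post; the grid and block sizes are the ones used in the question. This version also avoids calling malloc inside the kernel.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

typedef struct {
    int status;
    int location;
    double distance;
} Point;

// Each thread initializes one element of a flat, contiguous Point array.
// (initPoints is an illustrative name, not part of the original code.)
__global__ void initPoints(Point *d_points, int limit)
{
    int threadsPerBlock = blockDim.x * blockDim.y * blockDim.z;
    int idx = blockIdx.x * threadsPerBlock
            + threadIdx.z * blockDim.y * blockDim.x
            + threadIdx.y * blockDim.x
            + threadIdx.x;

    if (idx < limit) {
        d_points[idx].distance = 10.0;
        d_points[idx].location = -1;
        d_points[idx].status = -1;
    }
}

int main()
{
    const int totalGrid = 257 * 193 * 129;
    const size_t size = sizeof(Point) * totalGrid;

    // A single pointer to contiguous device memory holding totalGrid Points.
    Point *d_points = NULL;
    if (cudaMalloc((void **)&d_points, size) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    dim3 bs(16, 8, 8);                               // 1024 threads per block
    int blocks = (totalGrid + 1024 - 1) / 1024;      // round up to cover every element
    initPoints<<<blocks, bs>>>(d_points, totalGrid);
    if (cudaDeviceSynchronize() != cudaSuccess) {
        fprintf(stderr, "kernel failed\n");
        return 1;
    }

    // Copy one element back to the host to confirm the kernel wrote it.
    Point first;
    cudaMemcpy(&first, d_points, sizeof(Point), cudaMemcpyDeviceToHost);
    printf("%f %d %d\n", first.distance, first.location, first.status);

    cudaFree(d_points);
    return 0;
}
```

With this layout, any kernel that reads the data would use d_points[i].distance instead of d_memory[i]->distance.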

Note that cudaMalloc() allocates contiguous memory. The Point ** suggests that you want an array of pointers, for which I believe you need something like

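Point **d_memory;
/* Host-side staging array: the host cannot write to d_memory[row] directly */
Point **h_rows = (Point **)malloc(rows * sizeof(Point *));
cudaMalloc((void **)&d_memory, rows * sizeof(Point *));
for (row = 0 ; row < rows ; ++row)
    cudaMalloc((void **)&h_rows[row], columns * sizeof(Point));
/* Copy the table of device pointers into device memory */
cudaMemcpy(d_memory, h_rows, rows * sizeof(Point *), cudaMemcpyHostToDevice);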

But then you will need to check that everything else that takes d_memory as a parameter treats it accordingly.
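As an illustration of what "accordingly" can mean for the array-of-pointers layout, the sketch below builds the pointer table in a host-side staging array, copies it to the device, and has a kernel index it as d_rows[row][col]. The names fillRows, h_rows and d_rows and the small sizes are assumptions made for this example, not part of the original code.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

typedef struct {
    int status;
    int location;
    double distance;
} Point;

// One thread per row; each thread fills its row through the pointer table.
// (fillRows is an illustrative name.)
__global__ void fillRows(Point **d_rows, int rows, int columns)
{
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < rows) {
        for (int col = 0; col < columns; ++col) {
            d_rows[row][col].status = -1;
            d_rows[row][col].location = col;
            d_rows[row][col].distance = 10.0;
        }
    }
}

int main()
{
    const int rows = 4, columns = 8;   // small illustrative sizes

    // Host array that holds the device address of each row.
    Point **h_rows = (Point **)malloc(rows * sizeof(Point *));
    for (int r = 0; r < rows; ++r) {
        if (cudaMalloc((void **)&h_rows[r], columns * sizeof(Point)) != cudaSuccess)
            return 1;
    }

    // Device copy of the pointer table, which is what the kernel receives.
    Point **d_rows = NULL;
    if (cudaMalloc((void **)&d_rows, rows * sizeof(Point *)) != cudaSuccess)
        return 1;
    cudaMemcpy(d_rows, h_rows, rows * sizeof(Point *), cudaMemcpyHostToDevice);

    fillRows<<<1, rows>>>(d_rows, rows, columns);
    if (cudaDeviceSynchronize() != cudaSuccess)
        return 1;

    // Read back the last element of the last row, using the row address kept on the host.
    Point last;
    cudaMemcpy(&last, h_rows[rows - 1] + (columns - 1), sizeof(Point),
               cudaMemcpyDeviceToHost);
    printf("%d %d %f\n", last.status, last.location, last.distance);

    for (int r = 0; r < rows; ++r)
        cudaFree(h_rows[r]);
    cudaFree(d_rows);
    free(h_rows);
    return 0;
}
```

The staging array is needed because d_rows points to device memory, so the host cannot store into d_rows[r] directly.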

Also, cudaMalloc() returns cudaSuccess when the allocation was successful; you never check for that.
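One way to check every return value is to wrap each runtime call in a small macro. This is only a sketch: CUDA_CHECK is a hypothetical helper name, not something provided by the CUDA toolkit, and the buffer below exists only to exercise the macro.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: evaluate a CUDA runtime call and abort with a
// message if it returns anything other than cudaSuccess.
#define CUDA_CHECK(call)                                               \
    do {                                                               \
        cudaError_t err_ = (call);                                     \
        if (err_ != cudaSuccess) {                                     \
            fprintf(stderr, "CUDA error %s:%d: '%s'\n",                \
                    __FILE__, __LINE__, cudaGetErrorString(err_));     \
            exit(EXIT_FAILURE);                                        \
        }                                                              \
    } while (0)

int main()
{
    int *d_buf = NULL;

    // Each call is checked at the point where it is made.
    CUDA_CHECK(cudaMalloc((void **)&d_buf, 1024 * sizeof(int)));
    CUDA_CHECK(cudaMemset(d_buf, 0, 1024 * sizeof(int)));
    CUDA_CHECK(cudaDeviceSynchronize());
    CUDA_CHECK(cudaFree(d_buf));

    printf("all CUDA calls succeeded\n");
    return 0;
}
```

Unlike the cudaCheckError() macro in the question, which reads cudaGetLastError() after the fact, this form reports the failing call's own return value.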

Upvotes: 2
