Reputation: 955
I have the following (simplified) code in my .cu file
typedef struct
{
    int leg_id;
    int contract_id;
} CudaLeg;
typedef struct
{
    CudaLeg* legs;
    int no_legs;
} CudaPortfolio;
extern "C"
__global__ void kernel(CudaPortfolio* o_portfolios_cuda, const int no_portfolios)
{
    // fill o_portfolios_cuda with data
}
void cudaFunction(CudaPortfolio* o_portfolios, unsigned long long no_portfolios)
{
    CudaPortfolio* o_portfolios_cuda;
    cudaMalloc((void **)& o_portfolios_cuda, sizeof(CudaPortfolio) * no_portfolios);
    kernel<<<32, 32>>>(o_portfolios_cuda, no_portfolios);
    cudaMemcpy(o_portfolios, o_portfolios_cuda, sizeof(CudaPortfolio) * no_portfolios, cudaMemcpyDeviceToHost);
    //printf below works
    printf("CPU no legs strike output portfolio: %d\n", o_portfolios[0].no_legs);
    //printf below crashes the program
    printf("CPU Leg 1 multiplier output portfolio: %d\n", o_portfolios[0].legs[0].multiplier);
    cudaFree(o_portfolios_cuda);
}
The GPU is a GTX 580 (sm_20). The kernel works fine with o_portfolios_cuda: it can fill it with data and do calculations with it. The first printf of o_portfolios[0].no_legs prints the correct value. But when I try to access one of the portfolio's legs (o_portfolios[0].legs[0].multiplier), the program crashes. Any ideas how I can fix this? Thank you.
@Robert Crovella I already tried something like that, but it didn't work. I tried it again and added
CudaLeg* o_portfolios_legs_cuda;
cudaMalloc((void **)& o_portfolios_legs_cuda, sizeof(CudaLeg));
cudaMemcpy(o_portfolios_legs_cuda, o_portfolios->legs, sizeof(CudaLeg), cudaMemcpyHostToDevice);
cudaMemcpy(&(o_portfolios_cuda->legs), &o_portfolios_legs_cuda, sizeof(CudaLeg *), cudaMemcpyHostToDevice);
But now the program crashes on the 3rd line I just added (the cudaMemcpy(o_portfolios_legs_cuda, ...) call).
@MarkoR The CudaLeg objects don't have a fixed number.
Upvotes: 0
Views: 1555
Reputation: 9779
When you copy the pointer legs, which points to a device memory address, to host memory, you also need to change the pointer to a host address that holds a host copy of the original device data.
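A minimal sketch of that idea, assuming every legs pointer stored in the device structs came from cudaMalloc and really holds no_legs elements. The helper name copyPortfoliosToHost is hypothetical, not something from the question:

```cuda
#include <cstdlib>
#include <cuda_runtime.h>

typedef struct { int leg_id; int contract_id; } CudaLeg;
typedef struct { CudaLeg* legs; int no_legs; } CudaPortfolio;

// Hypothetical helper: copies the portfolio structs AND their legs to the host.
// Assumption: each d_portfolios[i].legs is a valid device pointer to no_legs elements.
// On failure, entries before the failing index own host buffers and later entries
// still hold device addresses, so the caller has to clean up accordingly.
cudaError_t copyPortfoliosToHost(CudaPortfolio* h_portfolios,
                                 const CudaPortfolio* d_portfolios,
                                 int no_portfolios)
{
    // Step 1: shallow copy. After this, h_portfolios[i].legs still holds a
    // DEVICE address and must not be dereferenced on the host.
    cudaError_t err = cudaMemcpy(h_portfolios, d_portfolios,
                                 sizeof(CudaPortfolio) * no_portfolios,
                                 cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) return err;

    for (int i = 0; i < no_portfolios; ++i) {
        CudaLeg* d_legs = h_portfolios[i].legs;  // device address
        size_t bytes = sizeof(CudaLeg) * h_portfolios[i].no_legs;

        // Step 2: allocate a host buffer and copy the legs into it.
        CudaLeg* h_legs = (CudaLeg*)malloc(bytes);
        if (h_legs == NULL && bytes > 0) return cudaErrorMemoryAllocation;
        err = cudaMemcpy(h_legs, d_legs, bytes, cudaMemcpyDeviceToHost);
        if (err != cudaSuccess) { free(h_legs); return err; }

        // Step 3: patch the pointer so it refers to host memory.
        h_portfolios[i].legs = h_legs;
    }
    return cudaSuccess;
}
```

After a successful call, o_portfolios[0].legs[0] can be read on the host. The host buffers are then owned by the caller and need a free() each; the device-side legs still need their own cudaFree, so the device addresses should be kept somewhere before they are overwritten.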
Upvotes: 0
Reputation: 553
You are allocating space for the CudaPortfolio structs, each of which holds one int and one CudaLeg pointer. But you are not allocating space for what that pointer points to. So when you try to access it, the program crashes.
How to fix it: If you are only going to have one CudaLeg, you can drop the pointer and just have "CudaLeg leg" inside CudaPortfolio. If you are going to have a fixed number of CudaLeg objects, you can have, for example, "CudaLeg legs[5]" inside CudaPortfolio. If you don't have a fixed number of CudaLeg objects and want to keep the struct as it is right now, you need to do an additional malloc for the legs of each portfolio and assign the result to that portfolio's legs pointer. See the link that Robert Crovella posted in the comment to see how it is done.
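For the variable-count case, the extra allocation could look roughly like the sketch below. It is only an illustration: allocateLegsOnDevice, fillLegs, and the legs_per_portfolio array are hypothetical names, and it assumes the leg counts are known on the host before the kernel runs.

```cuda
#include <cuda_runtime.h>

typedef struct { int leg_id; int contract_id; } CudaLeg;
typedef struct { CudaLeg* legs; int no_legs; } CudaPortfolio;

// Hypothetical helper: gives every device-side portfolio its own legs buffer.
// Assumption: legs_per_portfolio[i] is the host-known leg count of portfolio i.
cudaError_t allocateLegsOnDevice(CudaPortfolio* d_portfolios,
                                 const int* legs_per_portfolio,
                                 int no_portfolios)
{
    for (int i = 0; i < no_portfolios; ++i) {
        // Build the struct on the host first; only the legs buffer is on the device.
        CudaPortfolio staged;
        staged.legs = NULL;
        staged.no_legs = legs_per_portfolio[i];

        if (staged.no_legs > 0) {
            cudaError_t err = cudaMalloc((void**)&staged.legs,
                                         sizeof(CudaLeg) * staged.no_legs);
            if (err != cudaSuccess) return err;
        }

        // Copy the struct (device pointer + count) into slot i of the device array.
        // &d_portfolios[i] is plain address arithmetic; nothing is dereferenced here.
        cudaError_t err = cudaMemcpy(&d_portfolios[i], &staged,
                                     sizeof(CudaPortfolio), cudaMemcpyHostToDevice);
        if (err != cudaSuccess) return err;
    }
    return cudaSuccess;
}

// Illustrative kernel: one thread per portfolio writes into the legs allocated above.
__global__ void fillLegs(CudaPortfolio* portfolios, int no_portfolios)
{
    int p = blockIdx.x * blockDim.x + threadIdx.x;
    if (p >= no_portfolios) return;
    for (int l = 0; l < portfolios[p].no_legs; ++l) {
        portfolios[p].legs[l].leg_id = l;
        portfolios[p].legs[l].contract_id = p;
    }
}
```

Note that any host-to-device cudaMemcpy of leg data needs a source that is a valid host buffer. Passing a legs pointer that was never allocated on the host, or one that actually holds a device address, is one plausible reason for a crash in such a copy.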
Upvotes: 1