Reputation: 31724
I am a Java programmer and new to C++ and CUDA. I am getting a segmentation fault with the code below:
Input.h
class Input {
public:
const LucyDecimal * sellingPrice; //Ri
const LucyDecimal qc;
public:
Input(
const LucyDecimal * _sellingPrice,
const LucyDecimal _qc);
virtual ~Input();
};
Input.cpp
Input::Input(
const LucyDecimal * _sellingPrice, //Ri
const LucyDecimal _qc):sellingPrice(_sellingPrice),qc(_qc)
{};
Input::~Input() {
}
Now in my CudaRun.cu
void mainRun(Input in) {
Input *deviceIn;
deviceIn = new Input(NULL, NULL, NULL, NULL, 0.0, NULL,0.0,0.0,NULL,0.0,NULL,0.0);
//line-a
printf("Started. Just abt to call cuda \n");
int size = sizeof(Input);
cudaMalloc((void**) &deviceIn, size);
cudaMemcpy(deviceIn, &in, size, cudaMemcpyHostToDevice);
cudaMalloc((void**) deviceIn->sellingPrice, 4 * sizeof(LucyDecimal));
//line-b
....
}
I get a segmentation fault at line-b. Does it have anything to do with the initialization at line-a?
Reputation: 151879
You don't get to create a pointer to device memory:
cudaMalloc((void**) &deviceIn, size);
And then dereference that pointer in host code:
cudaMalloc((void**) deviceIn->sellingPrice, 4 * sizeof(LucyDecimal));
To evaluate deviceIn->sellingPrice, host code must dereference a pointer computed from the base pointer (deviceIn), which now holds a device address, and this dereferencing is illegal in host code. The same applies to writing the allocated pointer value into the sellingPrice member of the deviceIn structure: the host cannot store through a device address directly.
Copying structures containing pointers is called a "deep copy" and it's somewhat tedious.
Instead you need to allocate a separate pointer:
LucyDecimal * sellingPrice_temp;
cudaMalloc((void**) &sellingPrice_temp, 4 * sizeof(LucyDecimal));
And then copy that allocated pointer from host to device, in the appropriate location:
cudaMemcpy(&(deviceIn->sellingPrice), &sellingPrice_temp, sizeof(LucyDecimal *), cudaMemcpyHostToDevice);
Note that finding the address of a particular location in a structure (&(deviceIn->sellingPrice)) is something the compiler can compute without dereferencing the base pointer (deviceIn).
You will need to use sellingPrice_temp again if you want to copy the data from that embedded pointer region back to the host at some point.
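Putting the steps above together, here is a minimal sketch of the whole deep copy. It makes several assumptions that are not in the question: LucyDecimal is taken to be a plain double, Input is reduced to a simple struct with no virtual members, and scalePrices, TRY and deepCopyDemo are hypothetical names used only for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

typedef double LucyDecimal;  // assumption: stand-in for the question's type

// Simplified stand-in for the question's Input class (no virtual members).
struct Input {
    const LucyDecimal *sellingPrice;
    LucyDecimal qc;
};

// Hypothetical helper: leave the enclosing function on any CUDA error.
#define TRY(call) do { if ((call) != cudaSuccess) return 1; } while (0)

// Device code may dereference the embedded pointer; host code may not.
__global__ void scalePrices(const Input *in, LucyDecimal *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in->sellingPrice[i] * in->qc;
}

// Returns 0 if the deep copy worked and the kernel saw the right data.
int deepCopyDemo() {
    const int n = 4;
    LucyDecimal hostPrices[n] = {1.0, 2.0, 3.0, 4.0};
    LucyDecimal hostOut[n] = {0.0, 0.0, 0.0, 0.0};
    Input hostIn = {hostPrices, 2.0};

    Input *deviceIn = NULL;
    LucyDecimal *sellingPrice_temp = NULL;
    LucyDecimal *deviceOut = NULL;

    // Copy the struct itself; its pointer member still holds a host address.
    TRY(cudaMalloc((void**) &deviceIn, sizeof(Input)));
    TRY(cudaMemcpy(deviceIn, &hostIn, sizeof(Input), cudaMemcpyHostToDevice));

    // Allocate the embedded array through a separate host-side pointer variable.
    TRY(cudaMalloc((void**) &sellingPrice_temp, n * sizeof(LucyDecimal)));
    TRY(cudaMemcpy(sellingPrice_temp, hostPrices, n * sizeof(LucyDecimal),
                   cudaMemcpyHostToDevice));

    // Patch the device struct: copy the pointer value into the member's location.
    TRY(cudaMemcpy(&(deviceIn->sellingPrice), &sellingPrice_temp,
                   sizeof(LucyDecimal *), cudaMemcpyHostToDevice));

    TRY(cudaMalloc((void**) &deviceOut, n * sizeof(LucyDecimal)));
    scalePrices<<<1, n>>>(deviceIn, deviceOut, n);
    TRY(cudaGetLastError());

    // This blocking copy also waits for the kernel to finish.
    TRY(cudaMemcpy(hostOut, deviceOut, n * sizeof(LucyDecimal),
                   cudaMemcpyDeviceToHost));

    // Free via sellingPrice_temp, never via deviceIn->sellingPrice.
    cudaFree(deviceOut);
    cudaFree(sellingPrice_temp);
    cudaFree(deviceIn);

    for (int i = 0; i < n; ++i)
        if (hostOut[i] != hostPrices[i] * hostIn.qc) return 2;
    return 0;
}
```

Calling deepCopyDemo() from main and checking for a zero return is enough to try it out. The key point is that the host only ever touches sellingPrice_temp and the address &(deviceIn->sellingPrice), never the contents behind deviceIn.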
This topic comes up with some frequency; you can find many other examples if you search on e.g. "CUDA copy structure embedded pointer". The methodology is analogous to copying doubly-subscripted (**) dynamically allocated matrices from host to device.
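To illustrate the matrix analogy, here is a rough sketch of copying a double** matrix to the device. The names rowSums, TRY_M and matrixCopyDemo, as well as the 3x4 size, are made up for this example. The idea is the same as for the struct: the device row pointers are collected in a host-side array first, and that array of pointers is then copied to the device in one go.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: leave the enclosing function on any CUDA error.
#define TRY_M(call) do { if ((call) != cudaSuccess) return 1; } while (0)

// One thread per row; m[r][c] is a double dereference on the device.
__global__ void rowSums(double **m, double *sums, int rows, int cols) {
    int r = blockIdx.x * blockDim.x + threadIdx.x;
    if (r < rows) {
        double s = 0.0;
        for (int c = 0; c < cols; ++c) s += m[r][c];
        sums[r] = s;
    }
}

// Returns 0 if the device saw the same matrix contents as the host.
int matrixCopyDemo() {
    const int rows = 3, cols = 4;

    // Host matrix, dynamically allocated row by row.
    double **h = new double*[rows];
    for (int r = 0; r < rows; ++r) {
        h[r] = new double[cols];
        for (int c = 0; c < cols; ++c) h[r][c] = r * cols + c;
    }

    // Host-side array that holds DEVICE row pointers (the "temp" pointers).
    double *rowPtrs[rows];
    for (int r = 0; r < rows; ++r) {
        TRY_M(cudaMalloc((void**) &rowPtrs[r], cols * sizeof(double)));
        TRY_M(cudaMemcpy(rowPtrs[r], h[r], cols * sizeof(double),
                         cudaMemcpyHostToDevice));
    }

    // Device array of row pointers, filled by copying the pointer values over.
    double **d = NULL;
    TRY_M(cudaMalloc((void**) &d, rows * sizeof(double *)));
    TRY_M(cudaMemcpy(d, rowPtrs, rows * sizeof(double *), cudaMemcpyHostToDevice));

    double *dSums = NULL;
    double hSums[rows];
    TRY_M(cudaMalloc((void**) &dSums, rows * sizeof(double)));
    rowSums<<<1, rows>>>(d, dSums, rows, cols);
    TRY_M(cudaGetLastError());
    TRY_M(cudaMemcpy(hSums, dSums, rows * sizeof(double), cudaMemcpyDeviceToHost));

    // Compare against sums computed on the host.
    int status = 0;
    for (int r = 0; r < rows; ++r) {
        double expected = 0.0;
        for (int c = 0; c < cols; ++c) expected += h[r][c];
        if (hSums[r] != expected) status = 2;
    }

    // Rows are freed through the host-side copies of the device pointers.
    for (int r = 0; r < rows; ++r) { cudaFree(rowPtrs[r]); delete[] h[r]; }
    cudaFree(d);
    cudaFree(dSums);
    delete[] h;
    return status;
}
```

In practice a single flat allocation indexed as r * cols + c is often simpler than an array of row pointers, but the sketch keeps the ** layout to match the analogy.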
I'd also recommend proper CUDA error checking, although it won't be very instructive in this case.
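For reference, one common way to do that error checking is a small wrapper macro around every runtime API call, plus an explicit check after each kernel launch. The sketch below is only one possible shape; CUDA_CHECK, noop and checkedDemo are hypothetical names, not part of the CUDA API. As noted, it would not have caught the problem in the question, since that segmentation fault happens in host code before any CUDA error is reported.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical macro: report the failing call's location and stop.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "CUDA error: %s at %s:%d\n",              \
                    cudaGetErrorString(err_), __FILE__, __LINE__);    \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

__global__ void noop(int *flag) { *flag = 1; }

// Returns 0 if every call succeeded and the kernel actually ran.
int checkedDemo() {
    int *dFlag = NULL;
    int hFlag = 0;

    CUDA_CHECK(cudaMalloc((void**) &dFlag, sizeof(int)));
    CUDA_CHECK(cudaMemcpy(dFlag, &hFlag, sizeof(int), cudaMemcpyHostToDevice));

    noop<<<1, 1>>>(dFlag);
    CUDA_CHECK(cudaGetLastError());       // catches launch configuration errors
    CUDA_CHECK(cudaDeviceSynchronize());  // catches errors during kernel execution

    CUDA_CHECK(cudaMemcpy(&hFlag, dFlag, sizeof(int), cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(dFlag));
    return hFlag == 1 ? 0 : 1;
}
```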