Reputation: 256
I am writing a simple n-body simulation in CUDA, which I am trying to visualize with OpenGL.
After I have initialized my particle data on the CPU, allocated the respective memory and transferred that data to the GPU, the program has to enter the following cycle:
1) Compute the forces on each particle (CUDA part)
2) Update particle positions (CUDA part)
3) Display the particles for this time step (OpenGL part)
4) Go back to 1)
I set up the interface between CUDA and OpenGL with the following declarations:
GLuint dataBufferID;
particle_t* Particles_d;
particle_t* Particles_h;
cudaGraphicsResource *resources[1];
I allocate space in an OpenGL GL_ARRAY_BUFFER and register that buffer as a cudaGraphicsResource using the following code:
void createVBO()
{
    // create buffer object
    glGenBuffers(1, &dataBufferID);
    glBindBuffer(GL_ARRAY_BUFFER, dataBufferID);
    glBufferData(GL_ARRAY_BUFFER, bufferStride*N*sizeof(float), 0, GL_DYNAMIC_DRAW);
    glBindBuffer(GL_ARRAY_BUFFER, 0);
    // register the buffer with CUDA (registration takes a register flag, not a map flag)
    checkCudaErrors(cudaGraphicsGLRegisterBuffer(resources, dataBufferID, cudaGraphicsRegisterFlagsNone));
}
Lastly, the program cycle that I described (steps 1 to 4) is realized by the following function update(int):
void update(int value)
{
    // map OpenGL buffer object for writing from CUDA
    float* dataPtr;
    checkCudaErrors(cudaGraphicsMapResources(1, resources, 0));
    size_t num_bytes;
    // get a pointer to that buffer object for manipulation with CUDA
    checkCudaErrors(cudaGraphicsResourceGetMappedPointer((void **)&dataPtr, &num_bytes, resources[0]));
    // fill the graphics resource with particle position data
    launch_kernel<<<NUM_BLOCKS,NUM_THREADS>>>(Particles_d, dataPtr, 1);
    // unmap buffer object
    checkCudaErrors(cudaGraphicsUnmapResources(1, resources, 0));
    glutPostRedisplay();
    glutTimerFunc(milisec, update, 0);
}
I compile and run the program, and I get the following errors:
CUDA error at src/main.cu:390 code=4(cudaErrorLaunchFailure) "cudaGraphicsMapResources(1, resources, 0)"
CUDA error at src/main.cu:392 code=4(cudaErrorLaunchFailure) "cudaGraphicsResourceGetMappedPointer((void **)&dataPtr, &num_bytes,resources[0])"
CUDA error at src/main.cu:397 code=4(cudaErrorLaunchFailure) "cudaGraphicsUnmapResources(1, resources, 0)"
Does anyone know what might cause these errors? Am I supposed to create the data buffer using createVBO() every time before update(int) is executed?
P.S. For clarity, my kernel function is the following:
__global__ void launch_kernel(particle_t* Particles, float* data, int KernelMode){
    int i = blockIdx.x*THREADS_PER_BLOCK + threadIdx.x;
    if(KernelMode == 1){
        // N_d is allocated in device memory
        if(i >= N_d) // valid particle indices are 0..N_d-1
            return;
        // update the position and then the data buffer
        updateX(Particles+i);
        for(int d=0;d<DIM_d;d++){
            data[i*bufferStride_d+d] = Particles[i].p[d]; // write the new coordinates into the data buffer
        }
        // fill in also the RGB data and the radius. In general THIS IS NOT NECESSARY!! NEED TO PERFORM ONCE! REFACTOR!!!
        data[i*bufferStride_d+DIM_d]   = Particles[i].r;
        data[i*bufferStride_d+DIM_d+1] = Particles[i].g;
        data[i*bufferStride_d+DIM_d+2] = Particles[i].b;
        data[i*bufferStride_d+DIM_d+3] = Particles[i].radius;
    }else{
        // if KernelMode = 2 then Update Y
        float* Fold = new float[DIM_d];
        for(int d=0;d<DIM_d;d++)
            Fold[d] = Particles[i].force[d];
        // of course in parallel :)
        computeForces(Particles,i);
        updateV(Particles+i,Fold);
        delete [] Fold;
    }
    // note: __syncthreads() only synchronizes the threads within one block, not the whole grid
    __syncthreads();
}
Upvotes: 0
Views: 1512
Reputation: 256
As I mentioned in one of the comments above, it turned out that I had passed the wrong compute capability option to the compiler. I ran cuda-memcheck and saw that the CUDA API launch was failing. Once I found the right compiler options, everything worked like a charm.
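Kernel launches are asynchronous, so a failed launch is often reported only by a later runtime call, which is why the errors in the question appear at cudaGraphicsMapResources rather than at the kernel itself. Checking right after the launch makes the failure show up where it happens. The following is a minimal sketch of that check, not the code from the question: dummyKernel, blocksFor and launchAndCheck are hypothetical names used only for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel, used only to have something to launch.
__global__ void dummyKernel(float* data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;            // guard: valid indices are 0..n-1
    data[i] = 2.0f * data[i];
}

// Number of blocks needed to cover n elements (rounds up).
inline int blocksFor(int n, int threadsPerBlock)
{
    return (n + threadsPerBlock - 1) / threadsPerBlock;
}

// Launch the kernel and report launch-time and execution-time errors separately.
inline cudaError_t launchAndCheck(float* data_d, int n)
{
    const int threads = 256;
    dummyKernel<<<blocksFor(n, threads), threads>>>(data_d, n);

    // Errors detected at launch (invalid configuration, no usable kernel image, ...).
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel launch failed: %s\n", cudaGetErrorString(err));
        return err;
    }

    // Errors raised while the kernel was running; this call blocks until it finishes.
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess)
        fprintf(stderr, "kernel execution failed: %s\n", cudaGetErrorString(err));
    return err;
}
```

In update() from the question, the same two checks could go directly after the launch_kernel<<<...>>> line. Synchronizing on every frame costs some performance, so this is mainly a debugging aid. The target architecture itself is selected with nvcc's -arch option, with a value that matches the compute capability of the device in use.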
Upvotes: 1