Reputation: 63
This is the source code of my simple CUDA C++ program:
#include <iostream>
#include <cuda.h>
using namespace std;

__global__ void AddIntsCUDA(int *a, int *b, int *c)
{
    *c = *a + *b;
}

int main()
{
    int a, b, c;
    int *d_a, *d_b, *d_c;
    int size = sizeof(int);

    cudaMalloc((void **)&d_a, size);
    cudaMalloc((void **)&d_b, size);
    cudaMalloc((void **)&d_c, size);

    a = 10;
    b = 35;
    c = 0;

    cudaMemcpy(d_a, &a, size, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, &b, size, cudaMemcpyHostToDevice);

    AddIntsCUDA<<<1, 1>>>(d_a, d_b, d_c);

    cudaMemcpy(&c, d_c, size, cudaMemcpyDeviceToHost);

    cout << "The Answer is " << c << endl;

    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_c);

    system("pause");
    return 0;
}
The console output shows c = 0, but I expect the sum of a and b (45, because a = 10 and b = 35). Can someone explain what is happening in this code?
Upvotes: 3
Views: 6363
Reputation: 952
Try adding cudaError_t err = cudaDeviceSynchronize(); after the kernel launch and before the copy, and print the value of err.
Use const char* cudaGetErrorString(cudaError_t error) to get the error string at runtime, or look the code up here:
https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html#group__CUDART__TYPES_1gf599e5b8b829ce7db0f5216928f6ecb6
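Following your comment that it's error number 35 (cudaErrorInsufficientDriver), the installed driver is older than the CUDA runtime your program was built against, so you need to update your driver.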
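
For reference, the checks suggested above could be combined into something like the sketch below. It is only an illustration, not the one correct way to do it: addOnDevice is a hypothetical helper name that does not appear in the question, and the pattern of stopping at the first failing call is just one of several reasonable styles. Every CUDA runtime call returns a cudaError_t, so checking each one shows which step fails first instead of silently leaving c at 0.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Same kernel as in the question: adds two ints that live in device memory.
__global__ void AddIntsCUDA(const int *a, const int *b, int *c)
{
    *c = *a + *b;
}

// Hypothetical helper (not part of the question): runs a + b on the GPU and
// returns the first CUDA error encountered, or cudaSuccess if all steps worked.
cudaError_t addOnDevice(int a, int b, int *result)
{
    int *d_a = nullptr, *d_b = nullptr, *d_c = nullptr;
    const size_t size = sizeof(int);

    // Each step runs only if every previous step succeeded.
    cudaError_t err = cudaMalloc((void **)&d_a, size);
    if (err == cudaSuccess) err = cudaMalloc((void **)&d_b, size);
    if (err == cudaSuccess) err = cudaMalloc((void **)&d_c, size);
    if (err == cudaSuccess) err = cudaMemcpy(d_a, &a, size, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) err = cudaMemcpy(d_b, &b, size, cudaMemcpyHostToDevice);

    if (err == cudaSuccess) {
        AddIntsCUDA<<<1, 1>>>(d_a, d_b, d_c);
        err = cudaGetLastError();        // reports errors from the launch itself
    }
    if (err == cudaSuccess) err = cudaDeviceSynchronize();  // reports errors from kernel execution
    if (err == cudaSuccess) err = cudaMemcpy(result, d_c, size, cudaMemcpyDeviceToHost);

    if (err != cudaSuccess) {
        // Print both the numeric code and the human-readable description.
        fprintf(stderr, "CUDA error %d: %s\n", (int)err, cudaGetErrorString(err));
    }

    // Freeing a null pointer is a no-op, so this is safe after a partial failure.
    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_c);
    return err;
}
```

Calling addOnDevice(10, 35, &c) from main and checking the returned value before printing c makes a failure visible right away. With an outdated driver the very first runtime call would most likely already report the error, rather than the program continuing and printing 0.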
Upvotes: 2