Arjun Mehta

Reputation: 2552

How to cudaMemcpy a __device__ initialized var

I have some working code... where I allocate a device variable pointer as follows:

float *d_var;
cudaMalloc(&d_var, sizeof(float) );

Later in my code, I want to copy the contents of this variable to a host variable (ref):

checkCudaErrors(cudaMemcpy(&h_var, d_var, sizeof(float), cudaMemcpyDeviceToHost));

Which works great! But using cudaMalloc is slow!


So I want to instead allocate the variable without using cudaMalloc using a __device__ definition:

__device__ float d_var = 1000000000.0f;

This works great: I know d_var is initialized properly in this case, and I can do all my work with it as before. I've been printf'ing its contents, so I know it holds the right value. But when I try to copy it to my host variable with a cudaMemcpy call like the one before:

checkCudaErrors(cudaMemcpy(&h_var, &d_var, sizeof(float), cudaMemcpyDeviceToHost));

I get a really vague error:

invalid argument cudaMemcpy(&h_var, &d_var, sizeof(float), cudaMemcpyDeviceToHost)

I've tried referring to the variable as &d_var, d_var, *d_var to no avail. Any help MUCH appreciated.

Thanks!

Upvotes: 2

Views: 641

Answers (1)

Arjun Mehta

Reputation: 2552

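Bah, I figured it out. A __device__ variable is a symbol, not a pointer returned by cudaMalloc, and taking its address with &d_var in host code does not give a valid device address, which is why cudaMemcpy() rejects it with "invalid argument". Instead of calling cudaMemcpy(), I need to use cudaMemcpyFromSymbol():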

checkCudaErrors(cudaMemcpyFromSymbol(&h_var, d_var, sizeof(float), 0, cudaMemcpyDeviceToHost));
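For anyone who wants to try this end to end, here is a minimal sketch of the fix as a complete program. It assumes a single .cu file compiled with nvcc. The CHECK macro is a hypothetical stand-in for checkCudaErrors (which comes from the CUDA samples' helper headers), and d_ptr is a name introduced only for illustration. The second half shows an alternative that should also work: asking the runtime for the symbol's device address with cudaGetSymbolAddress() and then using plain cudaMemcpy().

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Statically allocated device variable, as in the question.
__device__ float d_var = 1000000000.0f;

// Hypothetical stand-in for checkCudaErrors: print the error and bail out.
#define CHECK(call)                                               \
    do {                                                          \
        cudaError_t err_ = (call);                                \
        if (err_ != cudaSuccess) {                                \
            fprintf(stderr, "%s failed: %s\n", #call,             \
                    cudaGetErrorString(err_));                    \
            return 1;                                             \
        }                                                         \
    } while (0)

int main() {
    float h_var = 0.0f;

    // Option 1: copy directly from the symbol. Pass d_var itself, not &d_var.
    CHECK(cudaMemcpyFromSymbol(&h_var, d_var, sizeof(float), 0,
                               cudaMemcpyDeviceToHost));
    printf("from symbol: %f\n", h_var);

    // Option 2: look up the symbol's device address, then use cudaMemcpy.
    float *d_ptr = nullptr;  // illustrative name, not from the original post
    CHECK(cudaGetSymbolAddress((void **)&d_ptr, d_var));
    h_var = 0.0f;
    CHECK(cudaMemcpy(&h_var, d_ptr, sizeof(float), cudaMemcpyDeviceToHost));
    printf("via address: %f\n", h_var);

    return 0;
}
```

The first option is the shorter one for a single copy. The second may be handier if the same variable has to be passed to code that expects an ordinary device pointer.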

Upvotes: 4
