Reputation: 1101
I'm taking an online parallel programming course. The homework is done within a virtual machine on their site. My first assignment (below), which squares the numbers from 0 to ARRAY_SIZE - 1, ran as it should there. When I try to run it on my own machine, I get some strange values. I can't find anything wrong with the code. Any suggestions? (The output on my machine is posted below.)
And yes, I am aware that my kernel is called cube despite the fact that I am only squaring the number. I just never changed it.
#include <stdio.h>

__global__ void cube( float* d_in, float* d_out ){
    int idx = threadIdx.x;
    float f = d_in[idx];
    d_out[idx] = f*f;
}

int main(){
    const int ARRAY_SIZE = 8;
    const int ARRAY_BYTES = ARRAY_SIZE * sizeof(float);

    // Host memory
    float h_in[ARRAY_SIZE];
    float h_out[ARRAY_SIZE];
    for( int i = 0; i < ARRAY_SIZE; i++ )
        h_in[i] = (float)i;

    // Device memory pointers
    float* d_in;
    float* d_out;

    // Allocate device memory
    cudaMalloc( (void**) &d_in, ARRAY_BYTES );
    cudaMalloc( (void**) &d_out, ARRAY_BYTES );

    // Transfer input to device
    cudaMemcpy( d_in, h_in, ARRAY_BYTES, cudaMemcpyHostToDevice );

    // Launch the kernel
    cube<<<1,ARRAY_SIZE>>>(d_out,d_in);

    // Transfer device to host
    cudaMemcpy( h_out, d_out, ARRAY_BYTES, cudaMemcpyDeviceToHost );
    for(int i = 0; i < ARRAY_SIZE; i++)
        printf("%f\n",h_out[i]);

    // Free memory
    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
Output on my machine:
dan@mojo:~/Dropbox/code/gpu_programming$ nvcc -o first first.cu
dan@mojo:~/Dropbox/code/gpu_programming$ ./first
-0.000000
-nan
-0.000000
-nan
-0.000000
nan
-nan
-nan
Upvotes: 1
Views: 83
Reputation: 2660
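The kernel is declared as cube(float* d_in, float* d_out), but it is launched with the arguments reversed: cube<<<1,ARRAY_SIZE>>>(d_out,d_in). As a result, the kernel reads from the uninitialized d_out buffer and writes the squares into d_in, so the host copies back d_out contents that were never computed from your input. Switch the order of the arguments when launching the kernel, i.e.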
cube<<<1,ARRAY_SIZE>>>(d_in, d_out);
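For reference, here is a minimal sketch of the whole program with the arguments in the right order. The names square, square_on_device, and CUDA_CHECK are my own and are not from the question. The error checks are an addition: they would not have caught the swapped arguments (both pointers are valid device memory), but they help rule out other causes of garbage output, such as a missing device or a driver mismatch. The sketch assumes n fits in a single thread block, as in the question.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper macro (not part of the original post):
// print the CUDA error string and exit if a runtime call fails.
#define CUDA_CHECK(call)                                            \
    do {                                                            \
        cudaError_t err_ = (call);                                  \
        if (err_ != cudaSuccess) {                                  \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,      \
                    cudaGetErrorString(err_));                      \
            exit(EXIT_FAILURE);                                     \
        }                                                           \
    } while (0)

// Renamed from "cube" since it squares; input first, output second.
__global__ void square(const float* d_in, float* d_out) {
    int idx = threadIdx.x;
    float f = d_in[idx];
    d_out[idx] = f * f;
}

// Squares n floats on the device. Assumes n fits in one thread block.
void square_on_device(const float* h_in, float* h_out, int n) {
    const size_t bytes = n * sizeof(float);
    float* d_in = nullptr;
    float* d_out = nullptr;

    CUDA_CHECK(cudaMalloc((void**)&d_in, bytes));
    CUDA_CHECK(cudaMalloc((void**)&d_out, bytes));
    CUDA_CHECK(cudaMemcpy(d_in, h_in, bytes, cudaMemcpyHostToDevice));

    // Argument order matches the kernel signature: (input, output).
    square<<<1, n>>>(d_in, d_out);
    CUDA_CHECK(cudaGetLastError());       // reports launch failures
    CUDA_CHECK(cudaDeviceSynchronize());  // reports failures during execution

    CUDA_CHECK(cudaMemcpy(h_out, d_out, bytes, cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(d_in));
    CUDA_CHECK(cudaFree(d_out));
}

int main() {
    const int ARRAY_SIZE = 8;
    float h_in[ARRAY_SIZE];
    float h_out[ARRAY_SIZE];
    for (int i = 0; i < ARRAY_SIZE; i++)
        h_in[i] = (float)i;

    square_on_device(h_in, h_out, ARRAY_SIZE);

    for (int i = 0; i < ARRAY_SIZE; i++)
        printf("%f\n", h_out[i]);
    return 0;
}
```

Built the same way as in the question (nvcc -o first first.cu), this should print the squares of 0 through 7 on a machine with a working CUDA device.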
Upvotes: 2