CODEWITHSUNDEEP

Reputation: 31

CUDA file error "Invalid device function"

I have a GeForce GTX 295 GPU, Visual Studio 2012, and CUDA 6.5. I run simple code like this:

#include "stdafx.h"
#include <stdio.h>
#include <stdlib.h>  // malloc, free
#include <cuda.h>

// Kernel that executes on the CUDA device
__global__ void square_array(float *a, int N)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < N) a[idx] = a[idx] * a[idx];
}

// main routine that executes on the host
int main(void)
{
    float *a_h, *a_d;  // Pointers to host & device arrays
    const int N = 10;  // Number of elements in arrays
    size_t size = N * sizeof(float);
    a_h = (float *)malloc(size);       // Allocate array on host
    cudaMalloc((void **)&a_d, size);   // Allocate array on device

    // Initialize host array and copy it to CUDA device
    for (int i = 0; i < N; i++) a_h[i] = (float)i;
    cudaMemcpy(a_d, a_h, size, cudaMemcpyHostToDevice);

    // Do calculation on device
    int block_size = 4;
    int n_blocks = N / block_size + (N % block_size == 0 ? 0 : 1);
    square_array<<<n_blocks, block_size>>>(a_d, N);

    // Retrieve result from device and store it in host array
    cudaMemcpy(a_h, a_d, sizeof(float) * N, cudaMemcpyDeviceToHost);

    // Print results
    for (int i = 0; i < N; i++)
        printf("%d %f\n", i, a_h[i]);

    // Cleanup
    free(a_h);
    cudaFree(a_d);
}

When I call cudaGetLastError() after launching the kernel in this code, the console window shows the error "Invalid device function". How can I get rid of it? The sample codes from CUDA Toolkit 6.5 run successfully with Visual Studio 2012.

Upvotes: 3

Views: 10330

Answers (1)

Reputation: 5705

The GTX 295 has compute capability 1.3, I believe. It may be worth checking your project's compiler settings to see whether you are building for something newer than the card supports, such as compute_20,sm_20. If so, try changing these values to e.g. compute_10,sm_10, rebuild, and see whether it helps. See here for details on setting these values.

EDIT:

According to njuffa and the CUDA documentation, support for cc 1.0 devices was removed in CUDA 6.5, so you'll have to use compute_13,sm_13.
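
To confirm what the card actually reports before changing the build settings, something like the following minimal sketch may help. It prints the compute capability of device 0 and checks every runtime call, including the kernel launch. The CUDA_CHECK macro and the file name check_cc.cu are my own hypothetical names, not part of the question's code, and the sketch assumes the GPU of interest is device 0.

```cuda
// Hypothetical file name: check_cc.cu
// Possible command-line build for a cc 1.3 card with CUDA 6.5:
//   nvcc -gencode arch=compute_13,code=sm_13 check_cc.cu -o check_cc
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper macro (not from the question): print the CUDA error
// string with file and line, then exit, if a runtime call fails.
#define CUDA_CHECK(call)                                           \
    do {                                                           \
        cudaError_t err_ = (call);                                 \
        if (err_ != cudaSuccess) {                                 \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,     \
                    cudaGetErrorString(err_));                     \
            exit(EXIT_FAILURE);                                    \
        }                                                          \
    } while (0)

// Same kernel as in the question
__global__ void square_array(float *a, int N)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < N) a[idx] = a[idx] * a[idx];
}

int main(void)
{
    // Report the compute capability of device 0 (assumed to be the GPU in
    // question) so it can be compared with the compute_XX,sm_XX build values.
    cudaDeviceProp prop;
    CUDA_CHECK(cudaGetDeviceProperties(&prop, 0));
    printf("%s: compute capability %d.%d\n", prop.name, prop.major, prop.minor);

    const int N = 10;
    float *a_d = NULL;
    CUDA_CHECK(cudaMalloc((void **)&a_d, N * sizeof(float)));
    CUDA_CHECK(cudaMemset(a_d, 0, N * sizeof(float)));

    const int block_size = 4;
    const int n_blocks = (N + block_size - 1) / block_size;
    square_array<<<n_blocks, block_size>>>(a_d, N);

    // Launch failures are reported by cudaGetLastError(); errors raised
    // while the kernel runs surface at the synchronize call.
    CUDA_CHECK(cudaGetLastError());
    CUDA_CHECK(cudaDeviceSynchronize());

    CUDA_CHECK(cudaFree(a_d));
    return 0;
}
```

If the printed capability is lower than the architecture the project is built for, the launch check is where the error from the question would be expected to show up. In Visual Studio the corresponding setting is the Code Generation field under CUDA C/C++ > Device in the project properties.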

Upvotes: 3
