Kristian D'Amato

Reputation: 4046

CUDA & VS2010 problem

I have scoured the internets looking for an answer to this one, but couldn't find any. I've installed the CUDA 3.2 SDK (and, just now, CUDA 4.0 RC) and everything seems to work fine after long hours of fooling around with include directories, NSight, and all the rest. Well, except for one thing: the editor keeps highlighting the <<< >>> operator as a mistake. This happens only on VS2010, not on VS2008.

On VS2010 I also get several warnings of the following sort:

C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\include\xdebug(109): warning C4251: 'std::_String_val<_Ty,_Alloc>::_Alval' : class 'std::_DebugHeapAllocator<_Ty>' needs to have dll-interface to be used by clients of class 'std::_String_val<_Ty,_Alloc>'

Update: If I put the entry point in a .cpp file that calls a CUDA kernel, instead of writing main() in a .cu file as I was doing, the operator is actually flagged as an error, not just highlighted! The same thing happens with VS2008.

Anyone know how this can be fixed?

Update 2: Here is the code. The main.cpp file:

#include ""

int main()
{
    return 0;
}

and the .cu file:

#include <iostream>
#include "cuda.h"
#include "cuda_runtime.h"
#include "device_launch_parameters.h"
#include <cutil_inline.h>
#include <time.h>

using namespace std;

#define N 16

__global__ void MatAdd(float A[N][N], float B[N][N], float C[N][N])
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int j = blockIdx.y * blockDim.y + threadIdx.y;

    if (i < N && j < N)
        C[i][j] = A[i][j] + B[i][j];
}

int doStuff()
{
    dim3 threadsPerBlock(8, 8);
    dim3 numBlocks(N / threadsPerBlock.x, N / threadsPerBlock.y);

    float A[N][N], B[N][N], C[N][N];

    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j)
        {
            A[i][j] = 0;
            B[i][j] = 0;
            C[i][j] = 0;
        }

    clock_t start = clock();
    MatAdd<<<numBlocks, threadsPerBlock>>>(A, B, C);
    clock_t end = clock();

    cout << "Took " << float(end - start) << "ms to work out." << endl;

    return 0;
}

Update 3: Alright, I was (idiotically) including the CUDA code in the .cpp file, so of course it couldn't compile. Now I have CUDA 4.0 up and running on VS2010, but I still get several warnings of the kind explained above.

Upvotes: 0

Views: 2806

Answers (2)

Ade Miller

Reputation: 13753

You cannot do this...

#include "" 

Now you're asking the Visual Studio C++ compiler to compile the .cu file as though it were a header, and that compiler does not understand CUDA syntax such as <<< >>>. You need a header file that declares doStuff(), and main.cpp should include that header, not the definition.
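The split between declaration and definition might look roughly like the sketch below. The file names (doStuff.h, kernels.cu) and the function hostEntry() are hypothetical, and the body of doStuff() is only a plain C++ stand-in so that the sketch builds without nvcc. In the real project the definition, including the <<< >>> launch, stays in the .cu file.

```cpp
// ---- doStuff.h (hypothetical name): declaration only, no CUDA syntax ----
#ifndef DOSTUFF_H
#define DOSTUFF_H

int doStuff();  // defined in the .cu file and compiled by nvcc

#endif  // DOSTUFF_H

// ---- main.cpp: would contain `#include "doStuff.h"`, never the .cu file ----
// In a real project this function would be main(); it is named hostEntry()
// here only to keep the sketch a single, self-contained translation unit.
int hostEntry()
{
    return doStuff();  // ordinary C++ call; the host compiler never sees <<< >>>
}

// ---- kernels.cu (hypothetical name): the definition lives here ----
// Stand-in body (assumption): the real version would set up the data and
// launch MatAdd<<<numBlocks, threadsPerBlock>>>(...) before returning.
int doStuff()
{
    return 0;
}
```

With this layout the host compiler only needs the one-line declaration, and the linker connects the call in main.cpp to the object code that nvcc produces from the .cu file.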

The following might be helpful.

Typically I set this up as two projects: one that compiles the .cu files against the 2008 C++ compiler, and another that uses the 2010 compiler to get all the C++0x features.

The warnings you're getting can be fixed by exporting the appropriate templates. Something like this, but you'll have to write a specific one for each of the warning types.

#if defined(__CUDACC__)
#define DECLSPECIFIER  __declspec(dllexport)
#define EXPIMP_TEMPLATE
#else
#define DECLSPECIFIER  __declspec(dllimport)
#define EXPIMP_TEMPLATE extern
#endif

EXPIMP_TEMPLATE template class DECLSPECIFIER thrust::device_vector<unsigned long>;

See: Microsoft Knowledge Base article 168958.

I've written a step-by-step guide to setting up VS 2010 and CUDA 4.0 here

BTW: A better way of timing CUDA code is with the event API.

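cudaEvent_t start, stop;
float time;
cudaEventCreate( &start );   // events must be created before they are recorded
cudaEventCreate( &stop );
cudaEventRecord( start, 0 );
kernel<<<grid,threads>>> ( d_odata, d_idata, size_x, size_y, NUM_REPS);
cudaEventRecord( stop, 0 );
cudaEventSynchronize( stop );   // wait until the kernel and the stop event have completed
cudaEventElapsedTime( &time, start, stop );   // elapsed time in milliseconds
cudaEventDestroy( start );
cudaEventDestroy( stop );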

Upvotes: 1

Kristian D'Amato

Reputation: 4046

I was including the .cu file directly. Of course, that's pretty much including the CUDA code in the .cpp file, and hence the error!

Upvotes: 0
