Reputation: 1357
I have a C++ class-based DLL and would like to convert some of the class members to CUDA-based operations.
I am using VS2012, Windows 7, CUDA 6.5, sm_20.
Say the original SuperProjector.h file looks like this:
class __declspec(dllexport) SuperProjector
{
public:
    SuperProjector() {}
    ~SuperProjector() {}
    void sumVectors(float* c, float* a, float* b, int N);
};
and the original sumVectors() function in SuperProjector.cpp is:
void SuperProjector::sumVectors(float* c, float* a, float* b, int N)
{
    // Add the two vectors element by element.
    for (int n = 0; n < N; n++)
        c[n] = a[n] + b[n];
}
I am stuck on how I should convert sumVectors() to CUDA. Specifically: supposedly, putting the __global__ or __device__ keywords in front of class members will work, but do I then need to change the suffix of the .cpp file to .cu?
I am very confused about the best way to convert some of the members of this C++ class-based DLL into CUDA kernel functions. I would appreciate any ideas, or better, some very simple examples.
Upvotes: 0
Views: 1118
Reputation: 3850
Create a CUDA project, let's call it cudaSuperProjector, and add two files to it: cudaSuperProjector.cu and cudaSuperProjector.h.
cudaSuperProjector.h
class __declspec(dllexport) cudaSuperProjector {
public:
    cudaSuperProjector() { }
    ~cudaSuperProjector() { }
    void sumVectors(float* c, float* a, float* b, int N);
};
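One caveat about the header above: it is marked __declspec(dllexport) both for the DLL and for the application that later includes it. That generally still links, because the import library resolves the member functions, but the usual convention is to switch to __declspec(dllimport) on the consumer side. A minimal sketch of that pattern follows. The macro names CUDASUPERPROJECTOR_API and CUDASUPERPROJECTOR_EXPORTS are hypothetical, and the CPU body of sumVectors is only a stand-in so that the snippet compiles as a single file.

```cpp
#include <cassert>

// Hypothetical: normally defined only in the DLL project's preprocessor
// settings. Defined here so this single-file sketch takes the export branch.
#define CUDASUPERPROJECTOR_EXPORTS

#if defined(_WIN32)
#  ifdef CUDASUPERPROJECTOR_EXPORTS
#    define CUDASUPERPROJECTOR_API __declspec(dllexport)   // building the DLL
#  else
#    define CUDASUPERPROJECTOR_API __declspec(dllimport)   // using the DLL
#  endif
#else
#  define CUDASUPERPROJECTOR_API   // no decoration needed on other platforms
#endif

class CUDASUPERPROJECTOR_API cudaSuperProjector {
public:
    cudaSuperProjector() {}
    ~cudaSuperProjector() {}
    void sumVectors(float* c, float* a, float* b, int N);
};

// Stand-in CPU body (assumption) so the sketch is self-contained; in the real
// project this definition lives in cudaSuperProjector.cu and calls the GPU helper.
void cudaSuperProjector::sumVectors(float* c, float* a, float* b, int N) {
    assert(N >= 0);
    for (int n = 0; n < N; ++n) {
        c[n] = a[n] + b[n];
    }
}
```

Note that the header contains no CUDA types, so the consuming project can be built with the plain C++ compiler; only the .cu file needs nvcc.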
cudaSuperProjector.cu
#include <stdio.h>
#include "cuda_runtime.h"
#include "device_launch_parameters.h"
#include "cudaSuperProjector.h"
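__global__ void addKernel(float *c, const float *a, const float *b, unsigned int size) {
    // Global index of this thread across all blocks.
    unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
    // Threads past the end of the vectors must not touch memory.
    if (i < size) {
        c[i] = a[i] + b[i];
    }
}
// Helper function for using CUDA to add vectors in parallel.
cudaError_t addWithCuda(float *c, const float *a, const float *b, unsigned int size) {
    float *dev_a = 0;
    float *dev_b = 0;
    float *dev_c = 0;
    cudaError_t cudaStatus;
    // 256 threads per block is an arbitrary but common choice; round the block count up.
    const unsigned int threadsPerBlock = 256;
    const unsigned int blocks = (size + threadsPerBlock - 1) / threadsPerBlock;
    if (size == 0) return cudaSuccess; // Nothing to add.
    // Choose which GPU to run on, change this on a multi-GPU system.
    cudaStatus = cudaSetDevice(0);
    if (cudaStatus != cudaSuccess) goto Cleanup;
    // Allocate GPU buffers for three vectors (two input, one output).
    cudaStatus = cudaMalloc((void**)&dev_c, size * sizeof(float));
    if (cudaStatus != cudaSuccess) goto Cleanup;
    cudaStatus = cudaMalloc((void**)&dev_a, size * sizeof(float));
    if (cudaStatus != cudaSuccess) goto Cleanup;
    cudaStatus = cudaMalloc((void**)&dev_b, size * sizeof(float));
    if (cudaStatus != cudaSuccess) goto Cleanup;
    // Copy input vectors from host memory to GPU buffers.
    cudaStatus = cudaMemcpy(dev_a, a, size * sizeof(float), cudaMemcpyHostToDevice);
    if (cudaStatus != cudaSuccess) goto Cleanup;
    cudaStatus = cudaMemcpy(dev_b, b, size * sizeof(float), cudaMemcpyHostToDevice);
    if (cudaStatus != cudaSuccess) goto Cleanup;
    // Launch a kernel on the GPU with at least one thread for each element.
    addKernel<<<blocks, threadsPerBlock>>>(dev_c, dev_a, dev_b, size);
    // Check for any errors launching the kernel.
    cudaStatus = cudaGetLastError();
    if (cudaStatus != cudaSuccess) goto Cleanup;
    // cudaDeviceSynchronize waits for the kernel to finish, and returns
    // any errors encountered during the launch.
    cudaStatus = cudaDeviceSynchronize();
    if (cudaStatus != cudaSuccess) goto Cleanup;
    // Copy output vector from GPU buffer to host memory.
    cudaStatus = cudaMemcpy(c, dev_c, size * sizeof(float), cudaMemcpyDeviceToHost);
Cleanup:
    // Release the GPU buffers on every path; cudaFree accepts a null pointer.
    cudaFree(dev_c);
    cudaFree(dev_a);
    cudaFree(dev_b);
    return cudaStatus;
}
void cudaSuperProjector::sumVectors(float* c, float* a, float* b, int N) {
    if (N <= 0) return; // A negative N would wrap around when converted to unsigned.
    cudaError_t cudaStatus = addWithCuda(c, a, b, N);
    if (cudaStatus != cudaSuccess) {
        fprintf(stderr, "cudaSuperProjector::sumVectors failed: %s\n", cudaGetErrorString(cudaStatus));
    }
}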
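A note on sizing: a single block is limited to 1024 threads on sm_20 hardware, so vectors longer than that need several blocks plus a bounds check inside the kernel. The host-side arithmetic can be sketched in plain C++. The helper names blocksFor and emulateGuardedAdd are hypothetical, and the nested loops only mimic on the CPU what blockIdx.x * blockDim.x + threadIdx.x computes on the GPU.

```cpp
#include <cassert>
#include <vector>

// Number of blocks needed so that blocks * threadsPerBlock >= n (ceiling division).
unsigned int blocksFor(unsigned int n, unsigned int threadsPerBlock) {
    assert(threadsPerBlock > 0);
    return (n + threadsPerBlock - 1) / threadsPerBlock;
}

// CPU emulation of a guarded element-wise add: every (block, thread) pair
// computes a global index and skips it when it falls past the end.
std::vector<float> emulateGuardedAdd(const std::vector<float>& a,
                                     const std::vector<float>& b,
                                     unsigned int threadsPerBlock) {
    assert(a.size() == b.size());
    const unsigned int n = static_cast<unsigned int>(a.size());
    std::vector<float> c(n, 0.0f);
    const unsigned int blocks = blocksFor(n, threadsPerBlock);
    for (unsigned int block = 0; block < blocks; ++block) {
        for (unsigned int thread = 0; thread < threadsPerBlock; ++thread) {
            const unsigned int i = block * threadsPerBlock + thread;
            if (i < n) {  // same guard the kernel needs
                c[i] = a[i] + b[i];
            }
        }
    }
    return c;
}
```

For 1000 elements and 256 threads per block this gives 4 blocks, that is 1024 threads, of which the last 24 are filtered out by the guard.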
Note: In the properties of the file cudaSuperProjector.cu, Item Type should be CUDA C/C++.
Go to the properties of the project and, under General, set Configuration Type to Dynamic Library (.dll). Now everything needed to build the library is ready. Compile this project, and in the output folder you will find cudaSuperProjector.dll and cudaSuperProjector.lib. Create the directory cudaSuperProjector\lib and copy cudaSuperProjector.dll and cudaSuperProjector.lib there. Also create cudaSuperProjector\include and copy cudaSuperProjector.h into it.
Create another Visual C++ project, let's call it SuperProjector. Add the file SuperProjector.cpp to the project.
SuperProjector.cpp
#include <stdio.h>
// Resolved through the Include Directories setting (e.g. D:\cudaSuperProjector\include).
#include "cudaSuperProjector.h"
int main(int argc, char** argv) {
    float a[6] = { 0, 1, 2, 3, 4, 5 };
    float b[6] = { 1, 2, 3, 4, 5, 6 };
    float c[6] = { };
    cudaSuperProjector csp;
    csp.sumVectors(c, a, b, 6);
    printf("c = {%f, %f, %f, %f, %f, %f}\n",
        c[0], c[1], c[2], c[3], c[4], c[5]);
    return 0;
}
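Before trusting the GPU path, it can help to compare its output with a plain CPU loop. The following is a rough sketch of such a check. referenceSum and allClose are hypothetical helper names, and the tolerance passed in is left to the caller.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// CPU reference: element-wise a + b.
std::vector<float> referenceSum(const std::vector<float>& a,
                                const std::vector<float>& b) {
    assert(a.size() == b.size());
    std::vector<float> c(a.size());
    for (std::size_t n = 0; n < a.size(); ++n) {
        c[n] = a[n] + b[n];
    }
    return c;
}

// True when got[n] and expected[n] agree within tol for every element.
// got must point to at least expected.size() floats.
bool allClose(const float* got, const std::vector<float>& expected, float tol) {
    for (std::size_t n = 0; n < expected.size(); ++n) {
        if (std::fabs(got[n] - expected[n]) > tol) {
            return false;
        }
    }
    return true;
}
```

In the main() above, one could build the reference from a and b after the sumVectors call and print a warning whenever allClose(c, reference, 1e-6f) returns false.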
In the properties of the project, add the path to the dll and lib files to VC++ Directories -> Library Directories, for example D:\cudaSuperProjector\lib;, and in VC++ Directories -> Include Directories add the path to the header, for example D:\cudaSuperProjector\include;. Then go to Linker -> Input and add cudaSuperProjector.lib; to Additional Dependencies.
Now your project should compile fine, but when you run it, it will show the error
The program can't start because cudaSuperProjector.dll is missing from your computer. Try reinstalling the program to fix this problem.
You need to copy cudaSuperProjector.dll to the output folder of the project, so that it sits in the same folder as SuperProjector.exe. You can do it manually or add
copy D:\cudaSuperProjector\lib\cudaSuperProjector.dll $(SolutionDir)$(Configuration)\
in Build Events -> Post-Build Event -> Command Line, where $(SolutionDir)$(Configuration)\ is the output path for the solution (see Configuration Properties -> General -> Output Directory).
Upvotes: 3