Muhammad Gulfam

Reputation: 71

How to enable separate compilation for CUDA project in Visual Studio

I am new to CUDA. I am trying to write an application that launches one kernel from inside another kernel, but the build fails with the error "kernel launch from device or global functions requires separate compilation mode". Here is my complete code. Any help would be appreciated.

#include<iostream>
#include<curand.h>
#include<cuda.h>
#include <curand_kernel.h>
#include <stdlib.h>
#include <stdio.h>
using namespace std;

__device__ int *vectorData;
__device__ void initializeArray(int elementCount)
{
    for (int i = 0; i < elementCount; i++)
    {
        vectorData[i] = 1;
    }
}
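// One thread per element; the bounds check guards the last, partially filled block.
__global__ void AddOneToEachElement(int elementCount)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < elementCount)
    {
        vectorData[i] = vectorData[i] + 1;
    }
}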
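__global__ void addKernel(int *numberOfElements)
{
    int elementCount = *numberOfElements;
    if (elementCount <= 0) return;
    // Allocate room for every element, not just a single int.
    vectorData = (int*)malloc(elementCount * sizeof(int));
    if (vectorData == NULL) return; // device heap allocation failed
    initializeArray(elementCount);
    // Integer ceiling division: number of 1024-thread blocks needed.
    int gridSize = (elementCount + 1023) / 1024;
    AddOneToEachElement<<<gridSize, 1024>>>(elementCount);
    // Note: device-side cudaDeviceSynchronize() is deprecated in newer CUDA toolkits (11.6 and later).
    cudaDeviceSynchronize();
    free(vectorData);
}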

int main()
{
    int numberOfElements = 1;
    int *device_numberOfElements;
    cudaMalloc((int**)&device_numberOfElements, sizeof(int));
    cout << "Enter the Number of elements" << endl;
    cin >> numberOfElements;
    cudaMemcpy(device_numberOfElements, &(numberOfElements), sizeof(int), cudaMemcpyHostToDevice);
    addKernel<<<1, 1>>>(device_numberOfElements);
    cudaDeviceSynchronize(); // wait for the parent kernel and its child grid to finish
    cudaFree(device_numberOfElements);
    return 0;
}

Upvotes: 2

Views: 2219

Answers (1)

Muhammad Gulfam

Reputation: 71

The issue was resolved using the information at the following link: Using CUDA dynamic parallelism in Visual Studio

Here is the complete information that I obtained from the above-mentioned link:

Starting with CUDA 5.0, CUDA supports dynamic parallelism on GPUs with compute capability 3.5 or higher. Dynamic parallelism allows kernels to be launched directly from other kernels, which can speed up applications that benefit from managing their workload at run time on the GPU itself. In many cases it avoids CPU/GPU round trips, which helps patterns such as recursion. To use dynamic parallelism in Visual Studio 2010 or Visual Studio 2013, do the following:

  • View -> Property Pages
  • Configuration Properties -> CUDA C/C++ -> Common -> Generate Relocatable Device Code -> Yes (-rdc=true)
  • Configuration Properties -> CUDA C/C++ -> Device -> Code Generation -> compute_35,sm_35
  • Configuration Properties -> Linker -> Input -> Additional Dependencies -> cudadevrt.lib
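
For anyone who wants to verify the settings, here is a minimal sketch of a parent kernel launching a child kernel. It is not taken from the linked post; the names childAddOne, parentKernel, supportsDynamicParallelism, and runDemo are hypothetical, as is the file name in the build comment. It assumes a GPU with compute capability 3.5 or higher is device 0. The build comment shows what should be the nvcc command-line equivalent of the property-page settings above.

```cuda
// Command-line equivalent of the settings above (dynpar.cu is a placeholder name):
//   nvcc -rdc=true -arch=sm_35 dynpar.cu -o dynpar -lcudadevrt
// Pick the -arch value that matches your GPU and toolkit; 3.5 is only the minimum.
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Child kernel: one thread per element, with a bounds check for the last block.
__global__ void childAddOne(int *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

// Parent kernel: a single thread launches the child grid from device code.
// This launch is what requires relocatable device code and cudadevrt.
__global__ void parentKernel(int *data, int n)
{
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads; // integer ceiling division
    childAddOne<<<blocks, threads>>>(data, n);
}

// Hypothetical helper: true if the device reports compute capability >= 3.5.
bool supportsDynamicParallelism(int device)
{
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess) return false;
    return prop.major > 3 || (prop.major == 3 && prop.minor >= 5);
}

// Returns how many of the n elements equal 1 afterwards, or -1 on error.
int runDemo(int n)
{
    if (n <= 0 || !supportsDynamicParallelism(0)) return -1;
    int *d = nullptr;
    if (cudaMalloc(&d, n * sizeof(int)) != cudaSuccess) return -1;
    cudaMemset(d, 0, n * sizeof(int));

    parentKernel<<<1, 1>>>(d, n);
    // A parent grid is not complete until its child grids are, so one
    // host-side synchronization waits for both.
    cudaError_t err = cudaDeviceSynchronize();

    std::vector<int> h(n, 0);
    if (err == cudaSuccess)
        err = cudaMemcpy(h.data(), d, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d);
    if (err != cudaSuccess) return -1;

    int ok = 0;
    for (int v : h) if (v == 1) ++ok;
    return ok;
}

int main()
{
    const int n = 1000;
    int ok = runDemo(n);
    if (ok < 0) {
        printf("CUDA error, or device 0 lacks compute capability 3.5\n");
        return 1;
    }
    printf("%d of %d elements were incremented\n", ok, n);
    return 0;
}
```

If Generate Relocatable Device Code is left off, a file like this should fail to build with the same "requires separate compilation mode" error as in the question, which makes it a quick way to confirm the project settings took effect.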

Upvotes: 4
