Cydouzo

Reputation: 525

Pass a __device__ lambda as argument to a __global__ function

Defining __device__ lambdas is quite useful.

I wanted to do the same thing as the code below, but with the lambda defined in a different file from the kernel that uses it.

// Sample code that works
template<typename Func>
__global__ void kernel(Func f){
    f(threadIdx.x);
}

int main(){
    auto f = [] __device__ (int i){ printf("Thread n°%i\n",i); };
    kernel<<<1,16>>>(f);
}

I tried this (not working) implementation.

main.cu

#include "kernelFile.h"

int main(){
    auto f = [] __device__ (int i){ printf("Thread n°%i\n",i); };
    kernelCaller(f);
}

kernelFile.cu

template<typename Func>
__global__ void kernel(Func f){
    f(threadIdx.x);
}

template<typename Func>
__host__ void kernelCaller(Func f){
    kernel<<<1,16>>>(f);
}

But the compiler complains that kernelCaller is never instantiated. I don't know whether it's possible to instantiate it, or whether what I'm trying to do should be implemented differently. Any hint on what I should do?

Upvotes: 0

Views: 637

Answers (1)

Robert Crovella

Reputation: 152123

There is no way to instantiate a templated function unless the desired type is known in the compilation unit where the instantiation takes place. This isn't specific to CUDA.

Therefore, any method will require some knowledge of the desired type in the compilation unit where the kernel function is compiled/instantiated. With that proviso, one possible approach is covered here: we can avoid the type uncertainty associated with a lambda by wrapping it in an nvstd::function object. Have your kernel accept an nvstd::function object (effectively a type wrapper for the lambda), and have your host caller insert the desired lambda into that object.

Here is an example:

$ cat k.cu
#include <nvfunctional>
#include <cstdio>

typedef nvstd::function<int(unsigned)> v;
__global__ void kernel(v *f){

  printf("%d, %d\n", threadIdx.x, (*f)(threadIdx.x));
}

__host__ void kernelCaller(v *f){
  kernel<<<1,2>>>(f);
}

$ cat m.cu
#include <nvfunctional>
// prototype would normally be in a header file
void kernelCaller(nvstd::function<int(unsigned)> *);


template <typename T1, typename T2>
__global__ void inserter(T1 *f, T2 l){
  *f = l;
}


int main(){

  nvstd::function<int(unsigned)> *d_f;
  cudaMalloc(&d_f, sizeof(nvstd::function<int(unsigned)>));
  auto lam1 = [] __device__ (unsigned i) { return i+1;};
  inserter<<<1,1>>>(d_f, lam1);
  kernelCaller(d_f);
  auto lam2 = [] __device__ (unsigned i) { return (i+1)*2;};
  inserter<<<1,1>>>(d_f, lam2);
  kernelCaller(d_f);
  cudaDeviceSynchronize();
}
$ nvcc -o test k.cu m.cu -std=c++11 -expt-extended-lambda -rdc=true
$ cuda-memcheck ./test
========= CUDA-MEMCHECK
0, 1
1, 2
0, 2
1, 4
========= ERROR SUMMARY: 0 errors
$
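For completeness, the prototype mentioned in the comment in m.cu would normally live in a small header shared by both files. A minimal sketch (the file name kernelFile.h is taken from the question; the typedef mirrors the one in k.cu):

```cuda
// kernelFile.h -- hypothetical shared header, matching the question's naming
#pragma once
#include <nvfunctional>

typedef nvstd::function<int(unsigned)> v;

// The kernel and its template machinery stay in k.cu; only this
// fixed-type wrapper needs to be visible to other translation units.
__host__ void kernelCaller(v *f);
```

Because kernelCaller takes a concrete type rather than a template parameter, it can be compiled once in k.cu and linked from anywhere.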

Upvotes: 3
