ocerv

Reputation: 51

Can NPP functions be called as a device function?

Can NPP functions, more specifically the NPPS signal-processing functions (https://docs.nvidia.com/cuda/npp/group__npps.html), be called from device code?

If I write a __global__ function (a kernel), can I call NPPS functions such as nppsMaxIndx_32f (which computes the maximum of a vector and its index) from inside it?

Example: I have 100 vectors of 10000 floats each. If I do this in host code, I have to make 100 separate calls to the NPP function.

If I launch a __global__ function with 100 threads and have each thread call the NPP function on its own vector, so that all 100 run simultaneously, will this work? In other words, can nppsMaxIndx_32f be called as a device function?

Upvotes: 3

Views: 393

Answers (1)

ocerv

Reputation: 51

This is not possible -- NPP functions are host-only functions. Trying to call one from a kernel produces compile errors:

functions.cu(237): error: calling a __host__ function("nppsMaxIndx_32f") from a __global__ function("computeMax") is not allowed

functions.cu(237): error: identifier "nppsMaxIndx_32f" is undefined in device code

However, if you make the calls from host code without synchronizing the GPU between them, they are issued almost simultaneously: the host does not wait for the previous call to finish before making the next one. This can only be done safely if the calls do not need to run in any particular order and the data used by overlapping calls is fully independent.
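As a rough illustration of the host-side approach, here is a minimal sketch that issues the 100 calls in a loop and synchronizes only once at the end. The helper name batchMaxIndx and the test data are made up for this example. It assumes the classic non-context NPP API in which nppsMaxIndxGetBufferSize_32f takes an int* for the size; newer toolkits may use different parameter types or prefer the _Ctx variants, so check the headers of your CUDA version. Each call gets its own output slot and its own slice of scratch memory so that no two calls share data.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>
#include <npps.h>

constexpr int kNumVectors = 100;    // sizes taken from the question
constexpr int kLength     = 10000;

// Hypothetical helper: max value and index of each vector, using only host-side NPP calls.
// hData holds kNumVectors vectors of kLength floats, stored back to back.
bool batchMaxIndx(const std::vector<float>& hData,
                  std::vector<float>& hMax, std::vector<int>& hIdx)
{
    if (hData.size() != static_cast<size_t>(kNumVectors) * kLength) return false;

    // Assumption: int-based buffer-size signature (older toolkits).
    int scratchBytes = 0;
    if (nppsMaxIndxGetBufferSize_32f(kLength, &scratchBytes) != NPP_SUCCESS) return false;
    // Round each call's scratch slice up to a 256-byte multiple to keep slices aligned.
    const size_t stride = (static_cast<size_t>(scratchBytes) / 256 + 1) * 256;

    Npp32f* dData = nullptr;
    Npp32f* dMax = nullptr;
    int* dIdx = nullptr;
    Npp8u* dScratch = nullptr;
    const size_t dataBytes = sizeof(Npp32f) * kNumVectors * kLength;

    bool ok = cudaMalloc(&dData, dataBytes) == cudaSuccess
           && cudaMalloc(&dMax, sizeof(Npp32f) * kNumVectors) == cudaSuccess
           && cudaMalloc(&dIdx, sizeof(int) * kNumVectors) == cudaSuccess
           && cudaMalloc(&dScratch, stride * kNumVectors) == cudaSuccess
           && cudaMemcpy(dData, hData.data(), dataBytes,
                         cudaMemcpyHostToDevice) == cudaSuccess;

    // Issue all calls back to back; no synchronization inside the loop.
    for (int v = 0; ok && v < kNumVectors; ++v) {
        ok = nppsMaxIndx_32f(dData + static_cast<size_t>(v) * kLength, kLength,
                             dMax + v, dIdx + v,
                             dScratch + v * stride) == NPP_SUCCESS;
    }

    // Wait once for all queued work, then fetch every result in two copies.
    hMax.resize(kNumVectors);
    hIdx.resize(kNumVectors);
    ok = ok && cudaDeviceSynchronize() == cudaSuccess
            && cudaMemcpy(hMax.data(), dMax, sizeof(Npp32f) * kNumVectors,
                          cudaMemcpyDeviceToHost) == cudaSuccess
            && cudaMemcpy(hIdx.data(), dIdx, sizeof(int) * kNumVectors,
                          cudaMemcpyDeviceToHost) == cudaSuccess;

    cudaFree(dData); cudaFree(dMax); cudaFree(dIdx); cudaFree(dScratch);
    return ok;
}

int main()
{
    // Made-up test data: all zeros except one positive peak per vector.
    std::vector<float> data(static_cast<size_t>(kNumVectors) * kLength, 0.0f);
    for (int v = 0; v < kNumVectors; ++v)
        data[static_cast<size_t>(v) * kLength + (v * 37) % kLength] = 1.0f + v;

    std::vector<float> maxVals;
    std::vector<int> maxIdx;
    if (!batchMaxIndx(data, maxVals, maxIdx)) {
        std::printf("NPP/CUDA call failed\n");
        return 1;
    }
    for (int v = 0; v < 3; ++v)
        std::printf("vector %d: max %f at index %d\n", v, maxVals[v], maxIdx[v]);
    return 0;
}
```

This should build with something like nvcc example.cu -lnpps -lnppc. Synchronizing once after the loop, rather than after every call, is what lets the host queue all 100 calls without waiting on each one.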

Upvotes: 1
