Reputation: 53
So I'm trying to see if I can get some significant speedup from using a GPU to solve a small overdetermined system of equations by solving a bunch at the same time. My current algorithm involves using an LU decomposition function from the CULA Dense library that also has to switch back and forth between the GPU and the CPU to initialize and run the CULA functions. I would like to be able to call the CULA functions from my CUDA kernels so that I don't have to jump back to the CPU and copy the data back. This would also allow me to create multiple threads that are working on different data sets to be solving multiple systems concurrently. My question is can I call CULA functions from device functions? I know it's possible with CUBLAS and some of the other CUDA libraries.
Thanks!
Upvotes: 2
Views: 330
Reputation: 151849
The short answer is no. The CULA library routines are designed to be called from host code, not device code.
Note that CULA have their own support forums here which you may be interested in.
Upvotes: 4