Reputation: 31
I'm making a medical imaging equipment. I want to use CUDA for making faster equipment
I receive 1024 size 1d data from CCD 512 times. before I perform IFFT I have to apply high performance interpolation algorithm (like cubic spline interpolation) to the 1024 size data each (then 1d interpolation 512 times).
Is there any CUDA library to perform cubic spline interpolation? (I found that there is one library, but it is for 2 or 3 dimensional image. Since I need to perform other complicated filtering functions, I need the data on the global memory, not on the texture memory.)
Is there any NUFFT (non uniform fast Fourier transform) library (doesn't need to be written for CUDA)? I'm thinking that if I have NUFFT function, I don't have to do interpolation and IFFT separately which is possible for making even faster equipment.
Upvotes: 3
Views: 5054
Reputation: 6917
Ruijters' bi/tricubic spline interpolation, which is I think what you refer to http://dannyruijters.nl/cubicinterpolation, (edited!) now works with 1D data, thank you! See Danny Ruijters' answer on this page.
Here're a few NUFFT implementations that I'm aware of, and brief thoughts on them.
I'm working, at a glacial pace, on porting Fessler's algorithm to Python/Cython, and maybe CUDA ("maybe" because just zero-padding the standard (CU)FFT and linear interpolation seems to work well enough). Best of luck.
Upvotes: 2
Reputation: 3420
Since more people have asked this, I have extended my CUDA cubic interpolation code with 1D cubic interpolation as well. You can find the updated code here: http://www.dannyruijters.nl/cubicinterpolation/
A working CUDA example that also contains 1D cubic interpolation can be found in the cudaAccuracyTest sample in the examples subdirectory in CI.zip.
For those of you who are more interested in a SSE approach, I have some working SSE optimized multi-threaded cubic interpolation code (albeit in 3D, not 1D) in the referenceCubicTexture3D sample in the examples subdirectory.
edit: The cubic interpolation code is now available on github. The 1D cubic interpolation code is here.
Upvotes: 3
Reputation: 5695
To be honest, your parallelism seems to be a bit low for the GPU. A 6core with SSE optimizations might outperform a GPU here.
Upvotes: 0
Reputation: 1800
I don't know about that algorithm, but if what you've found you think fast enough for your equipment, then why dont you change the implementation from using texture memory to just a simple array, and maybe you can do more speedup using shared memory?
I've found some written in matlab and fortran 77:
http://www.cims.nyu.edu/cmcl/nufft/nufft.html
http://www.mathworks.com/matlabcentral/fileexchange/25135-nufft-nufft-usffft
Upvotes: 0