Possible options to calculate a part of paralleled Program over GPU

Question

Hi I am not so familiar with gpu and I Just have a theoretical question.

So I am working on an application called Sassena, which calculates Neutron scattering from Molecular dynamics trajectories. This application is written in parallel with MPI and works for CPUs very well. But I am willing to run this app over GPU to make it faster. ofcourse not all of it but partly. when I look at the Source Code, The way it works is typical MPI, meaning the first rank send the data to each node individually and then each nodes does the calculation. Now, there is a part of calculation which is using Fast Fourier Transform(FFT), which consumes the most time and I want to send this part to GPU.

I see 2 Solutions ahead of me:

when the nodes reach the FFT part, they should send back the data to the main node, and when the main node gathered all the data it sends them to GPU, then GPU does the FFT, sends it back to cpu and cpu does the rest.
Each node would dynamically send the data to GPU and after the GPU does the FFT, it sends back to each node and they do the rest of their job.

So my Question is which one of these two are possible. I know first one is doable but it is having a lot of communication which is time consuming. But the second way I don't know if it is possible at all or not. I know in the second case it will be dependent on the Computer architecture as well. But is CUDA or OpenCL capable of this at all??

Thanks for any idea.

Possible options to calculate a part of paralleled Program over GPU

Answers (1)

Related Questions