Reputation: 43
I am solving the diffusion equation in 3D using FFTs, and one way to parallelise this is to decompose the 3D FFT into 2D FFTs.
As described in this paper: https://cmb.ornl.gov/members/z8g/csproject-report.pdf
The way to decompose a 3D FFT would be: a 2D FFT in the xy direction, then a global transpose, then a 1D FFT in the z direction.
My problem is that I am not sure how to do this global transpose (I assume it means transposing a 3D array). Has anyone come across this? Thanks a lot.
Upvotes: 3
Views: 2504
Reputation: 8476
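Think of a 3D cube with nx*ny*nz elements. The 3D FFT of these elements is mathematically 3 stages of 1D FFTs, one along each axis:
1. ny*nz transforms of length nx along the x axis
2. nx*nz transforms of length ny along the y axis
3. nx*ny transforms of length nz along the z axis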
More generally, an N-dimensional FFT (N>1) can be composed of (N-1)-dimensional FFTs over the first N-1 axes, followed by 1D FFTs along the remaining axis.
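The stage-by-stage view can be checked numerically. The following is a minimal sketch using NumPy (my choice for illustration, not something the question specifies); the array sizes and variable names are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
nx, ny, nz = 4, 5, 6
cube = rng.standard_normal((nx, ny, nz)) + 1j * rng.standard_normal((nx, ny, nz))

# Three stages of 1D FFTs, one along each axis.
stage1 = np.fft.fft(cube, axis=0)    # ny*nz transforms of length nx
stage2 = np.fft.fft(stage1, axis=1)  # nx*nz transforms of length ny
stage3 = np.fft.fft(stage2, axis=2)  # nx*ny transforms of length nz

# Same result from 2D FFTs over (x, y) followed by 1D FFTs along z.
two_d_then_one_d = np.fft.fft(np.fft.fft2(cube, axes=(0, 1)), axis=2)

reference = np.fft.fftn(cube)
stages_match = np.allclose(stage3, reference)
split_match = np.allclose(two_d_then_one_d, reference)
print(stages_match, split_match)
```

The order of the stages does not matter, because the transforms along different axes are independent of each other.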
If the signal is real and you have an FFT that can return the half spectrum, then stage 1 would be about half as expensive (a real FFT is cheaper). The remaining stages need to be complex, but they only need about half as many transforms, so the total cost is roughly half.
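As a rough illustration of the half-spectrum idea, here is a sketch with NumPy's `rfft`. In this sketch the real-input stage runs along the last axis; which axis goes first is my assumption, not something stated above.

```python
import numpy as np

rng = np.random.default_rng(1)
nx, ny, nz = 4, 6, 8
signal = rng.standard_normal((nx, ny, nz))  # real-valued input

# Stage 1: real-input FFT along z keeps only nz//2 + 1 frequency bins.
half = np.fft.rfft(signal, axis=2)
# Remaining stages are complex, but run over roughly half as many lines.
half = np.fft.fft(half, axis=1)
half = np.fft.fft(half, axis=0)

full = np.fft.fftn(signal)
half_shape = half.shape
agrees_with_full = np.allclose(half, full[:, :, : nz // 2 + 1])
agrees_with_rfftn = np.allclose(half, np.fft.rfftn(signal))
print(half_shape, agrees_with_full, agrees_with_rfftn)
```

The discarded bins are redundant for real input, since they are complex conjugates of bins that are kept.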
If your 1d FFT can read input elements that are strided and pack the output into a contiguous buffer, then you end up doing a transposition at each stage.
This is how kissfft performs multi-dimensional FFTs.
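The transposition-per-stage idea can be sketched in NumPy as follows. This is only an analogy and not kissfft's actual code: each stage transforms the contiguous last axis, then rotates that axis to the front and repacks the data. The function name `fft3d_by_rotation` is hypothetical.

```python
import numpy as np

def fft3d_by_rotation(cube):
    """Multi-dimensional FFT using only last-axis 1D FFTs plus axis rotations."""
    data = np.asarray(cube, dtype=complex)
    for _ in range(data.ndim):
        # Transform along the last (contiguous) axis.
        data = np.fft.fft(data, axis=-1)
        # Rotate the transformed axis to the front and repack contiguously,
        # so the next untransformed axis becomes the last one.
        data = np.ascontiguousarray(np.moveaxis(data, -1, 0))
    # After ndim rotations the axes are back in their original order.
    return data

rng = np.random.default_rng(2)
sample = rng.standard_normal((3, 4, 5))
rotation_matches = np.allclose(fft3d_by_rotation(sample), np.fft.fftn(sample))
print(rotation_matches)
```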
P.S. When I need to get a mental picture of higher dimensions, I think of: sheets of paper with matrices of numbers (2D), in folders of numbered papers (3D), in numbered filing cabinets (4D), in numbered rooms (5D), in numbered buildings (6D), and so on. So I can visualize the "filing cabinet" dimension.
Upvotes: 10
Reputation: 2429
The "global transposition" mentioned in the paper is not a mathematical operation, but a rearrangement of data between the distributed memory machines.
The data calculated on one machine in step 1 has to be transferred to all other machines, and vice versa, for step 2. It has nothing to do with a matrix transposition.
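To make the data movement concrete, here is a single-process sketch that simulates the exchange with NumPy. The "ranks" are just list entries, and the function name `distributed_fft3d` and the slab layout (split along z first, then along x) are my assumptions for illustration. On a real cluster the redistribution loop would roughly correspond to an all-to-all exchange such as `MPI_Alltoall`.

```python
import numpy as np

def distributed_fft3d(cube, n_ranks):
    """Simulate a slab-decomposed 3D FFT; each list entry plays one machine."""
    nx, ny, nz = cube.shape
    assert nx % n_ranks == 0 and nz % n_ranks == 0

    # Step 1: each rank owns a slab of z-planes and does 2D FFTs in xy.
    z_slabs = [np.fft.fft2(s, axes=(0, 1))
               for s in np.split(cube, n_ranks, axis=2)]

    # Global transpose: every rank sends one x-block to every other rank,
    # so that afterwards each rank holds complete z-lines for its x range.
    x_slabs = []
    for dest in range(n_ranks):
        pieces = [np.split(s, n_ranks, axis=0)[dest] for s in z_slabs]
        x_slabs.append(np.concatenate(pieces, axis=2))

    # Step 2: each rank does 1D FFTs along z on its local data.
    x_slabs = [np.fft.fft(s, axis=2) for s in x_slabs]
    return np.concatenate(x_slabs, axis=0)

rng = np.random.default_rng(3)
field = rng.standard_normal((4, 3, 8))
distributed_matches = np.allclose(distributed_fft3d(field, 2), np.fft.fftn(field))
print(distributed_matches)
```

Note that no FFT arithmetic happens in the exchange itself; it only changes which rank holds which part of the array.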
Upvotes: 2