Reputation: 1141
I implemented a simple FFT below (the final scaling is ignored):
typedef complex<double> base;
vector<base> w;
int FFTN = 1024;
void fft(vector<base> &fa){
int n = fa.size();
if (n==1) return;
int half = (n>>1);
vector<base> odd(half),even(half);
for(int i=0,j = 0;i<n;i+=2,j++) {
even[j] = fa[i];
odd[j] = fa[i+1];
}
fft(odd);
fft(even);
int fact = FFTN/n;
for (int i=0;i<half;i++){
fa[i] = even[i] + odd[i] * w[i * fact];
fa[i + half] = even[i] - odd[i] * w[i * fact];
}
}
It works well. But I am stuck in converting it to a iterative form. What I have tried so far:
int n = fa.size();
int fact = (FFTN>>1);
int half = 1;
while(half<n){
for(int i=0;i<n/half;i+=2){
base even = fa[i], odd = fa[i+1];
fa[i] = even + odd * w[i*fact];
fa[i+half] = even - odd*w[i*fact];
}
for(int j=0;j<n/half;j++)
fa[j] = fa[j+half];
fact >>= 1;
half <<= 1;
}
Can someone help me with the tricks on the conversion?
Upvotes: 2
Views: 629
Reputation: 275490
The first thing I'd do is make your function "more recursive".
void fft(base* fa, size_t stride, size_t n) {
if (n==1) return;
int half = (n>>1);
fft(fa+stride, stride*2, half); // odd
fft(fa, stride*2, half); // even
int fact = FFTN/n;
for (int i=0;i<half;i++){
fa[i] = fa[stride*2*i] + fa[stride*2+i+stride] * w[i * fact];
fa[i + half] = fa[stride*2*i] - fa[stride*2+i+stride] * w[i * fact];
}
}
void fft(std::vector<base>& fa){ fft(fa.data(), 1, fa.size()); }
now we do our fft
in-place within the buffer.
As we now have two different fft
implementations, we can test them against each other. Build some unit tests at this point, so further changes can be tested against a known "good" (or at least stable) set of behavior.
Next, we can examine the order which elements are combined in the original vector. Examine a buffer of length 4.
a b c d
we recurse doing odd and even
a[e] b[o] a[e] d[o]
Which then recurse doing odd and even
a[ee] b[oe] a[eo] d[oo]
these sets are size 1. They are left alone, then we combine on odd and even.
Now we look at 8. After two recursions, the elements are 'owned' by:
0[ee] 1[oe] 2[eo] 3[oo] 4[ee] 5[oe] 6[eo] 7[oo]
and after 3:
0[eee] 1[oee] 2[eoe] 3[ooe] 4[eeo] 5[oeo] 6[eoo] 7[ooo]
if we reverse those labels, and call e
0
and o
1
, we get:
0[000] 1[001] 2[010] 3[011] 4[100] 5[101] 6[110] 7[111]
which is binary counting. The first bit is discarded, and the elements who are now equal are combined in the 2nd to last recursive call.
Then the first two bits are discarded, and the elements with matching last bit are combined.
We can instead of looking at bits, look at the start of each combination, and the stride length.
The first combo is stride length equal to array length (1 element per).
The second is length/2. The third is length/4.
This continues until stride length 1.
The number of sub-arrays to combine is equal to the stride length.
So
for(size_t stride = n; stride = stride/2; stride!=0) {
for (size_t offset = 0; offset != stride; ++offset) {
fft_process( array+offset, stride, n/stride );
}
}
where fft_process
is based off of:
int fact = FFTN/n;
for (int i=0;i<half;i++){
fa[i] = fa[stride*2*i] + fa[stride*2+i+stride] * w[i * fact];
fa[i + half] = fa[stride*2*i] - fa[stride*2+i+stride] * w[i * fact];
}
maybe something like:
void fft_process( base* fa, size_t stride, size_t n ) {
int fact = FFTN/n; // equals stride I think! Assuming outermost n is 1024.
for (int i=0;i<half;i++){
fa[i] = fa[stride*2*i] + fa[stride*2+i+stride] * w[i * fact];
fa[i + half] = fa[stride*2*i] - fa[stride*2+i+stride] * w[i * fact];
}
}
none of this is tested, but it gives a step by step example of how to do this. You'll want to unleash your unit tests written earlier (to test the two earlier versions of fft
) on this iterative version.
Upvotes: 1
Reputation: 11201
This doesn't exactly provide a solution. But might help some who solves similar problems for converting recursive algorithms to iterative ones. Recursion is implemented in system with a stack. Each recursive call to a method pushes, following information onto the stack:
If the programmer could do the above with a stack + while loop
, we could implement recursive algorithms to iterative ones. Steps will be
A code sample for iterative factorial calculation using the above method.
int coreLogic( int current, int recursiveParameter ) {
return current * recursiveParameter ;
}
int factorial( int n ) {
std::stack<int> parameterStack ;
int tempFactorial = 1;
//parameters that would have been used to invoke the recursive call will now be pushed to stack
parameterStack.push( n );
while( !parameterStack.empty() ) {
//popping arguments from stack
int current = parameterStack.top();
parameterStack.pop();
//and invoking core logic
tempFactorial = coreLogic( tempFactorial, current );
if( current > 1 ) {
//parameters that would have been used to invoke the recursive call will now be pushed to stack
parameterStack.push( current - 1 );
}
/*
*if a divide and conquer algorithm like quick sort then again push right side args to stack
* - appers case in question
*if( condition ) {
* parameterStack.push( args );
*}
*/
}
return tempFactorial;
}
Upvotes: 1
Reputation: 18556
Here is my implementation:
typedef complex<double> Data;
const double PI = acos(-1);
// Merges [low, (low + high) / 2) with [(low + high) / 2, high) parts.
void merge(vector<Data>& b, int low, int high) {
int n = high - low;
Data cur(1), mul(cos(2. * PI / n), sin(2. * PI / n));
for (int i = low; i < low + n / 2; i++) {
Data temp = b[i + n / 2] * cur;
b[i + n / 2] = b[i] - temp;
b[i] = b[i] + temp;
cur = cur * mul;
}
}
// Computes FFT for the vector b.
void do_fft(vector<Data>& b) {
int n = b.size();
int hi = 0;
while ((1 << hi) < n)
hi++;
hi--;
// Permutes the input vector in a specific way.
vector<int> p(n);
for (int i = 0; i < n; i++)
for (int b = hi; b >= 0; b--)
if (i & (1 << b))
p[i] |= (1 << (hi - b));
vector<Data> buf(n);
for (int i = 0; i < n; i++)
buf[i] = b[p[i]];
copy(buf.begin(), buf.end(), b.begin());
for (int h = 2; h <= n; h *= 2)
for (int i = 0; i < n; i += h)
merge(b, i, i + h);
}
The idea of this implementation is to permute the given vector in such a way that we need to merge adjacent sub vectors at each step(that is, [0, 0] with [1, 1], [2, 2] with [3, 3] and so on at the first step, [0, 1] with [2, 3], [4, 5] with [6, 7] at the second step and so forth). It turns out that the elements should be permuted in the following way: we should take a binary representation of element's index, reverse it, and put the element with the reversed index to the current position. I cannot prove it rigorously, but drawing small pictures for n = 8
or n = 16
helps to understand that it is correct.
Upvotes: 1