Reputation: 333
I tried a simple FFT and compared the results between MATLAB and CUDA.
MATLAB: a vector of the 9 numbers 1-9
I = [1 2 3 4 5 6 7 8 9];
Running this code:
fft(I)
gives the results:
45.0000 + 0.0000i
-4.5000 +12.3636i
-4.5000 + 5.3629i
-4.5000 + 2.5981i
-4.5000 + 0.7935i
-4.5000 - 0.7935i
-4.5000 - 2.5981i
-4.5000 - 5.3629i
-4.5000 -12.3636i
And CUDA code:
int FFT_Test_Function() {
    int n = 9;
    double* in = new double[n];
    Complex* out = new Complex[n];
    for (int i = 0; i < n; i++)
    {
        in[i] = i + 1;
    }
    // Allocate the device buffers
    cufftDoubleReal *d_in;
    cufftDoubleComplex *d_out;
    unsigned int out_mem_size = sizeof(cufftDoubleComplex)*n;
    unsigned int in_mem_size = sizeof(cufftDoubleReal)*n;
    cudaMalloc((void **)&d_in, in_mem_size);
    cudaMalloc((void **)&d_out, out_mem_size);
    cudaCheckErrors("cuda malloc fail");
    // Save time stamp
    milliseconds timeStart = getCurrentTimeStamp();
    // Create a 1D double-precision real-to-complex plan (batch of 1)
    cufftHandle plan;
    cufftResult res = cufftPlan1d(&plan, n, CUFFT_D2Z, 1);
    if (res != CUFFT_SUCCESS) { cout << "cufft plan error: " << res << endl; return 1; }
    cudaMemcpy(d_in, in, in_mem_size, cudaMemcpyHostToDevice);
    cudaCheckErrors("cuda memcpy H2D fail");
    res = cufftExecD2Z(plan, d_in, d_out);
    if (res != CUFFT_SUCCESS) { cout << "cufft exec error: " << res << endl; return 1; }
    cudaMemcpy(out, d_out, out_mem_size, cudaMemcpyDeviceToHost);
    cudaCheckErrors("cuda memcpy D2H fail");
    milliseconds timeEnd = getCurrentTimeStamp();
    milliseconds totalTime = timeEnd - timeStart;
    std::cout << "Total time: " << totalTime.count() << std::endl;
    // Release the plan and the device/host buffers
    cufftDestroy(plan);
    cudaFree(d_in);
    cudaFree(d_out);
    delete[] in;
    delete[] out;
    return 0;
}
With this CUDA code I got the result:
You can see that CUDA gives 4 zeros (cells 5-9).
What am I missing?
Thank you very much for your attention!
Upvotes: 1
Views: 517
Reputation: 212929
CUFFT_D2Z is a real-to-complex FFT, so only the first N/2 + 1 points of the output (5 for N = 9, using integer division) are computed. The remaining top points are redundant: they are just the complex conjugate of the bottom half of the transform (you can see this in the MATLAB output if you compare pairs of terms which are mirrored about the mid-point).
You can fill in these "missing" terms if you need them, by just taking the complex conjugate of each corresponding term, but usually there isn't much point in doing this.
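If you do want the full MATLAB-style spectrum, the fill-in step can be sketched on the host as below. This is only a minimal sketch in plain C++: half_spectrum is a hypothetical stand-in for the D2Z output (a naive DFT that computes just the N/2 + 1 non-redundant bins), and expand_hermitian is a hypothetical helper name, not part of cuFFT. It relies on the symmetry X[N - k] = conj(X[k]) that holds for real input.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cd = std::complex<double>;

// Hypothetical stand-in for the D2Z result: a naive O(N^2) DFT of a real
// signal that computes only the N/2 + 1 non-redundant bins (k = 0 .. N/2).
std::vector<cd> half_spectrum(const std::vector<double>& x) {
    const std::size_t n = x.size();
    const double pi = std::acos(-1.0);
    std::vector<cd> half(n / 2 + 1);
    for (std::size_t k = 0; k < half.size(); ++k) {
        cd sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            const double angle =
                -2.0 * pi * static_cast<double>(k * t) / static_cast<double>(n);
            sum += x[t] * cd(std::cos(angle), std::sin(angle));
        }
        half[k] = sum;
    }
    return half;
}

// Hypothetical helper: rebuild all N bins from the N/2 + 1 stored ones
// using X[N - k] = conj(X[k]).
std::vector<cd> expand_hermitian(const std::vector<cd>& half, std::size_t n) {
    assert(half.size() == n / 2 + 1);
    std::vector<cd> full(n);
    // Copy the bins that were actually computed.
    for (std::size_t k = 0; k < half.size() && k < n; ++k) {
        full[k] = half[k];
    }
    // Fill the upper bins with the conjugates of their mirrored partners.
    for (std::size_t k = half.size(); k < n; ++k) {
        full[k] = std::conj(half[n - k]);
    }
    return full;
}
```

With the cuFFT code from the question, you would copy only the first n/2 + 1 elements of d_out back to the host and pass them to a routine like expand_hermitian with n = 9; the result should then line up with the nine values MATLAB prints.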
Upvotes: 3