Reputation: 21
I want to perform a 2D real-to-complex FFT with the clFFT library. The output array holds only zeros for both the real and imaginary parts, which is not correct (I have a working version implemented with fftw3). The input array float *in is correct. The lengths of the dimensions (in the code below) are receiver_number for N0 and signal_length for N1.
I have already checked the OpenCL initialization process for faults, but received no messages other than CL_SUCCESS.
cols and rows are the resulting dimensions of the complex output (Hermitian symmetry). The output array has the size sizeof(float) * cols * rows * 2, because every complex value stores a real and an imaginary part. The strides have to be set in order to tell clFFT the "distance between elements in all dimensions of the input/output buffer".
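The size arithmetic described above can be checked in isolation. The following is a minimal sketch, assuming a row-major real input of receiver_number rows by signal_length samples; HermitianShape and hermitian_shape are hypothetical names introduced for illustration and are not part of clFFT or fftw3.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of clFFT): shape of the Hermitian-interleaved
// output of a 2D real-to-complex FFT over a row-major real input.
struct HermitianShape {
    std::size_t rows;         // number of rows, unchanged by the transform
    std::size_t cols;         // complex values kept per row: signal_length / 2 + 1
    std::size_t float_count;  // floats to allocate: two per complex value
};

inline HermitianShape hermitian_shape(std::size_t receiver_number,
                                      std::size_t signal_length) {
    assert(receiver_number > 0 && signal_length > 0);
    // Only the non-redundant half of each row is stored (Hermitian symmetry).
    const std::size_t cols = signal_length / 2 + 1;
    return {receiver_number, cols, receiver_number * cols * 2};
}
```

For example, 4 receivers with 8 samples each give 5 complex values per row, so the output buffer needs 4 * 5 * 2 = 40 floats.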
auto cols = (signal_length / 2 + 1);
auto rows = receiver_number;
float *data_ptr = in; // in is a float pointer of size receiver_number * signal_length
float *out = new float[cols * rows * 2]; // to store real- & imag-values of the complex number next to each other (hermitian symmetry)
/* Prepare OpenCL memory objects and place data inside them. */
cl_int err;
cl_mem data_ptr_d = clCreateBuffer( context, CL_MEM_READ_WRITE, sizeof(float) * receiver_number * signal_length, NULL, &err );
cl_mem out_d = clCreateBuffer( context, CL_MEM_READ_WRITE, sizeof(float) * cols * rows * 2, NULL, &err );
err = clEnqueueWriteBuffer( queue, data_ptr_d, CL_TRUE, 0, sizeof(float) * receiver_number * signal_length, data_ptr, 0, NULL, NULL);
/* Create a default plan for FFT. */
const size_t N0 = receiver_number, N1 = signal_length;
clfftPlanHandle forward_plan;
size_t clLengthsForward[2] = {N0, N1};
err = clfftCreateDefaultPlan(&forward_plan, context, CLFFT_2D, clLengthsForward);
/* Set input and output stride */
size_t clStridesForwardIn[2] = {1, (unsigned long)signal_length};
size_t clStridesForwardOut[2] = {1, (unsigned long)cols};
err = clfftSetPlanInStride(forward_plan, CLFFT_2D, clStridesForwardIn);
err = clfftSetPlanOutStride(forward_plan, CLFFT_2D, clStridesForwardOut);
err = clfftSetPlanDistance(forward_plan, receiver_number * signal_length, cols * rows * 2);
/* Set plan parameters. */
err = clfftSetPlanPrecision(forward_plan, CLFFT_SINGLE);
err = clfftSetLayout(forward_plan, CLFFT_REAL, CLFFT_HERMITIAN_INTERLEAVED);
err = clfftSetResultLocation(forward_plan, CLFFT_OUTOFPLACE); // store output in separate array
/* Bake the plan. */
err = clfftBakePlan(forward_plan, 1, &queue, NULL, NULL);
/* Execute the plan. */
err = clfftEnqueueTransform(forward_plan, CLFFT_FORWARD, 1, &queue, 0, NULL, NULL, &data_ptr_d, &out_d, NULL);
/* Wait for calculations to be finished. */
err = clFinish(queue);
/* Fetch results of calculations. */
err = clEnqueueReadBuffer( queue, out_d, CL_TRUE, 0, sizeof(float)* cols * rows * 2, out, 0, NULL, NULL );
I receive only zeros for both the real and imaginary parts of the complex values. I can't really explain the error, since the strides and memory distances I set make sense to me. All memory allocation/transfer and kernel execution calls return CL_SUCCESS.
Upvotes: 0
Views: 84
Reputation: 21
I know that my question is probably quite application-specific, but for me the solution was to swap N0 and N1 in the clLengths array. I had just copied the N0 and N1 values from my fftw_plan_dft_r2c_2d() function call in the working fftw3 implementation, assuming the two libraries expect them in the same order.
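To make the fix concrete, here is a minimal sketch of the lengths and strides after the swap. It assumes, consistent with the fix above and with my understanding of clFFT, that index 0 of the lengths and strides arrays refers to the fastest-varying dimension, whereas fftw_plan_dft_r2c_2d() takes the slowest-varying dimension first. R2CLayout and r2c_layout are hypothetical names used only for illustration; the block does not call clFFT itself.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of clFFT): plan parameters for a row-major
// real input of receiver_number rows by signal_length samples.
struct R2CLayout {
    std::array<std::size_t, 2> lengths;      // {fastest, slowest} dimension
    std::array<std::size_t, 2> in_strides;   // in real elements
    std::array<std::size_t, 2> out_strides;  // in complex elements
};

inline R2CLayout r2c_layout(std::size_t receiver_number,
                            std::size_t signal_length) {
    assert(receiver_number > 0 && signal_length > 0);
    const std::size_t cols = signal_length / 2 + 1;  // Hermitian row length
    R2CLayout l;
    // Assumption: index 0 is the contiguous dimension, so signal_length
    // comes first. This is the reverse of the fftw3 argument order.
    l.lengths = {signal_length, receiver_number};
    l.in_strides = {1, signal_length};
    l.out_strides = {1, cols};
    return l;
}
```

Under these assumptions, lengths.data() would take the place of clLengthsForward in clfftCreateDefaultPlan, while the stride arrays match the ones already used in the question.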
Upvotes: 0