Reputation: 499
I know when dealing with std::vector
, it's best to use std::copy()
, however I was wondering how to copy an fftw_malloc
array into a std::vector
? Should I use std::copy()
or memcpy()
?
Example code:
static const int nx = 8;
static const int ny = 8;
static const int nyk = ny/2 + 1;
static const int Mx = 3*nx/2, My= 3*ny/2;
double M[2] = {Mx,My};
fftw_complex *array1;
array1= (fftw_complex*) fftw_malloc(nx*nyk* sizeof(fftw_complex));
memset(array1, 42, nx*nyk* sizeof(fftw_complex));
std::vector<std::complex<double>> array2(Mx*My,0.0);
std::copy( array1, array1+(nx*nyk) , array2); //std::copy( src, src + size, dest );
I am trying to copy array1
into array2
, where array2 > array1
, then would like to use std::vector
operators to append values, and so on.
This doesn't really work, and I am not sure if my vector
syntax is off, or std::copy()
is not the way to go. Should I use memcpy()
instead?
Upvotes: 1
Views: 399
Reputation: 16242
These days, it doesn't make a lot of difference to use fftw_malloc
vs. using regularly allocated memory for FFTW operations (see timing results at the end).
Having said that,
the problem in your code is two-fold.
First, the last argument of std::copy
needs to be an iterator (e.g. array2.begin()
).
Second, the value type is incompatible; you are transforming fftw_complex
into std::complex<double>
, which are different types (albeit they have the same bitwise representation).
So you have several options:
std::copy( reinterpret_cast<std::complex<double>*>(array1), reinterpret_cast<std::complex<double>*>(array1+(nx*nyk)) , array2.begin());
Which, in turn, can be simplified to
std::copy_n( reinterpret_cast<std::complex<double>*>(array1), nx*nyk, array2.begin());
If you don't like to reinterpret_cast
you can transform the data.
std::transform( array1, array1 + nx*nyk, array2.begin(), [](auto const& z) {return std::complex<double>(z[0], z[1])});
(there is no std::transform_n
).
You can skip the two-step process:
std::vector<std::complex<double>> array2(reinterpret_cast<std::complex<double>*>(array1), reinterpret_cast<std::complex<double>*>(array1+(nx*nyk)));
Honestly, I would try to make everything work with normal allocation.
Besides, what do you gain by allocating with fftw_malloc
to copy immediately to a regular std::vector
, possibly losing the alignment properties?
(I hope you are trying to copy to the vector after the FFTW operation.)
The most elegant solution is to have an FFTW-aware allocator altogether. https://gitlab.com/correaa/boost-multi/-/blob/master/include/multi/adaptors/fftw/memory.hpp?ref_type=heads#L27
using Z = std::complex<double>;
std::vector<Z, multi::fftw::allocator<Z>> array2(...);
In this way, you don't have to manage any memory and have the benefits of FFTW allocation, if any.
Finally, std::vector
is not even the best container to hold data subject to FFT operation;
if nothing else, because most interesting FFTs are multidimensional.
In my array library, FFTW is adapted and can be used in this way:
#include <multi/adaptors/fftw.hpp>
#include <multi/array.hpp>
namespace multi = boost::multi; // from https://gitlab.com/correaa/boost-multi
int main() {
multi::array<std::complex<double>, 2, multi::fftw::allocator<> > in({Mx, My});
...
multi::array<std::complex<double>, 2, multi::fftw::allocator<> > out({Mx, My});
multi::fftw::dft_forward({true, true}, in, out);
...
}
(code in this answer is not tested)
Timing results, fftw_malloc vs normal allocation:
I measured, and I don't see any benefit in using fftw_malloc
vs. new/std::allocator
.
These tests are for 1D and 2D data, run in a Intel® Core™ i7-9750H × 12 (48GB), running in Ubuntu 23.04 with FFTW version libfftw3-mpi3/lunar 3.3.10-1ubuntu1 amd64
and g++ 13 with -O3 -DNDEBUG
.
Upvotes: 2
Reputation: 117148
Background: The standard makes it defined to reinterpret_cast<double*>
from a std::complex<double>*
and vice versa. The memory layout of a std::complex<double>
is guaranteed to be that of two adjacent double
s in memory. Internally, it could be something like any of these:
template<class T>
struct complex1 {
T m_real;
T m_img;
};
//---
template<class T>
struct complex2 {
T m_data[2];
};
//---
template<class T> struct complex3;
template<>
struct complex3<double> {
double _Complex m_data; // The C complex type
};
std:complex<T>
is one of the few (maybe only?) class where you have this guarantee that it's ok to reinterpret_cast
a pointer to it and view it as-if through a pointer to another type, back and forth.
However, working with these different types isn't nice and clutters the code with reinterpret_cast
s so I would start by writing a wrapper, let's call it ComplexArray
, around the raw pointers that's returned by fftw_malloc
- and I'd use fftw_alloc_complex
instead of the more generic fftw_malloc
. The wrapper should give you as a C++ programmer the possibility of working with the data so that it can be used with the algorithms in the standard libary of C++.
I'd still use the fftw
allocation functions instead of just going for a std::vector<std::complex<double>>
directly (which is perfectly fine) because they "guarantee that the returned pointer obeys any special alignment restrictions imposed by any algorithm in FFTW (e.g. for SIMD acceleration)".
By providing this wrapper with iterator support, copying the data into a std::vector<std::complex<double>>
becomes just as easy as when working with any C++ container. You can even create the vector
directly from your wrapper's iterators.
The target would be the possibility to write your main
function like this:
int main() {
static const int nx = 8;
static const int ny = 8;
static const int nyk = ny / 2 + 1;
static const int Mx = 3 * nx / 2, My = 3 * ny / 2;
ComplexArray array1(nx * nyk); // the new C++ container
// no more memset(..., 42, ...), use real C++ std::complex<double>s:
array1.fill(std::complex<double>{1.4260258159703532e-105,
1.4260258159703532e-105});
// create a vector from your ComplexArray
std::vector<std::complex<double>> array2(array1.begin(), array1.end());
}
An example implementation of ComplexArray
could look like this and will work as a substitute for std::vector<std::complex<double>>
in most situations you want when working with fftw
so you may not even need to copy it to a vector
and you can add copying and creating it from iterators and other useful things if you need. This is a start:
#include <algorithm>
#include <complex>
#include <iterator>
#include <memory>
class ComplexArray {
public:
using value_type = std::complex<double>;
using pointer = value_type*;
using const_pointer = const value_type*;
using reference = value_type&;
using const_reference = const value_type&;
using iterator = value_type*;
using const_iterator = const value_type*;
using reverse_iterator = std::reverse_iterator<iterator>;
using reverse_const_iterator = std::reverse_iterator<const_iterator>;
ComplexArray() = default;
ComplexArray(std::size_t size) : m_data(fftw_alloc_complex(size)) {}
// move semantics comes for free, copying does not:
ComplexArray(ComplexArray&&) = default;
ComplexArray& operator=(ComplexArray&&) = default;
// misc ...
std::size_t size() const { return m_size; }
void fill(const std::complex<double>& val) { std::fill(begin(), end(), val); }
// access
const_pointer data() const {
return reinterpret_cast<const_pointer>(m_data.get());
}
pointer data() { return reinterpret_cast<pointer>(m_data.get()); }
const_reference operator[](std::size_t idx) const { return data()[idx]; }
reference operator[](std::size_t idx) { return data()[idx]; }
// functions to provide fftw functions direct access to the data
const fftw_complex* fftw_data() const { return m_data.get(); }
fftw_complex* fftw_data() { return m_data.get(); }
// iterators ...
const_iterator cbegin() const { return data(); }
const_iterator cend() const { return data() + m_size * sizeof(value_type); }
const_iterator begin() const { return cbegin(); }
const_iterator end() const { return cend(); }
iterator begin() { return data(); }
iterator end() { return data() + m_size * sizeof(value_type); }
// reverse iterators ...
reverse_const_iterator crbegin() const {
return reverse_const_iterator(cend());
}
reverse_const_iterator crend() const {
return reverse_const_iterator(cbegin());
}
reverse_const_iterator rbegin() const { return crbegin(); }
reverse_const_iterator rend() const { return crend(); }
reverse_iterator rbegin() { return reverse_iterator(end()); }
reverse_iterator rend() { return reverse_iterator(begin()); }
private:
struct fftw_freer {
void operator()(fftw_complex* cplx) const noexcept { fftw_free(cplx); }
};
std::unique_ptr<fftw_complex, fftw_freer> m_data;
std::size_t m_size = 0;
};
Upvotes: 3