MrEtArn

Reputation: 325

What could cause a segmentation fault stemming from fftw_destroy_plan?

I have a large C++ program that runs under mpirun and uses multi-threaded FFTW.

All FFTW calls go through a wrapper class. I will not post the whole class, as it contains various other structures and classes, but the relevant part of the constructor is:

int N_threads;
   fftw_complex *work;
   fftw_plan forward,backward;
   ...
   ...
   if(!fftw_init_threads()) error("Failed to initialize multitread fftw at nfft.h");
    int max_thread=omp_get_max_threads();
    fftw_plan_with_nthreads(N_threads); 
    if((!slave)&(max_thread<N_threads)) printf("A request to create an fftw with %i threads, the maximum available thread number is %i \n",N_threads,max_thread);
    for (int i=0; i < N.Dimension(); ++i) siz *= N(i);
    work = (fftw_complex*) fftw_malloc(sizeof(fftw_complex)*siz);
    int sign = FFTW_FORWARD;
    forward = fftw_plan_dft(N.Dimension(),&N(0), work, work,sign,Flags);
    sign = FFTW_BACKWARD;
    backward = fftw_plan_dft(N.Dimension(),&N(0), work, work, sign,Flags);

and the destructor includes the following commands:

fftw_free(work);
if (forward)  fftw_destroy_plan(forward);
if (backward) fftw_destroy_plan(backward);
fftw_cleanup_threads();

During destruction, in some of the runs and on some of the mpirun nodes, I get a segmentation fault with the following backtrace:

[nina14:13154] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x10340) [0x7ff474164340]
[nina14:13154] [ 1] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(+0x2a36c) [0x7ff47550636c]
[nina14:13154] [ 2] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(fftw_plan_awake+0x16) [0x7ff4754ff626]
[nina14:13154] [ 3] /usr/lib/x86_64-linux-gnu/libfftw3_threads.so.3(+0x31d0) [0x7ff4752d81d0]
[nina14:13154] [ 4] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(fftw_plan_awake+0x16) [0x7ff4754ff626]
[nina14:13154] [ 5] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(fftw_destroy_plan+0x13) [0x7ff4755cf723]

I hear that memory bugs like this are hard to track down. I have gone over all the data I could and still couldn't find the source.

valgrind seems to have problems tracing the symbols used by mpirun. Even with -v --leak-check=full and --trace-children=yes, it stopped monitoring right after mpirun was launched, with the message "Syscall param writev(vector[...]) points to uninitialised byte(s)". I couldn't get any information out of it that was useful to me.

I'm looking both for clues on how to solve this bug and for tips on using valgrind or other tools to locate it.

EDIT: eventually I was able to run valgrind directly on the executable (without the mpirun parameters), and it pointed to the same area:

==21869== Invalid read of size 8
==21869==    at 0x5BFB36C: ??? (in /usr/lib/x86_64-linux-gnu/libfftw3.so.3.3.2)
==21869==    by 0x5BF4625: fftw_plan_awake (in /usr/lib/x86_64-linux-gnu/libfftw3.so.3.3.2)
==21869==    by 0x5BF4625: fftw_plan_awake (in /usr/lib/x86_64-linux-gnu/libfftw3.so.3.3.2)
==21869==    by 0x5CC4722: fftw_destroy_plan (in /usr/lib/x86_64-linux-gnu/libfftw3.so.3.3.2)
==21869==    by 0x4BAB29: CartesianInterpreter::~CartesianInterpreter() (nfft.h:216)

and

==21869== Process terminating with default action of signal 11 (SIGSEGV)
==21869==  Access not within mapped region at address 0x790
==21869==    at 0x5BFB36C: ??? (in /usr/lib/x86_64-linux-gnu/libfftw3.so.3.3.2)
==21869==    by 0x5BF4625: fftw_plan_awake (in /usr/lib/x86_64-linux-gnu/libfftw3.so.3.3.2)
==21869==    by 0x5BF4625: fftw_plan_awake (in /usr/lib/x86_64-linux-gnu/libfftw3.so.3.3.2)
==21869==    by 0x5CC4722: fftw_destroy_plan (in /usr/lib/x86_64-linux-gnu/libfftw3.so.3.3.2)
==21869==    by 0x4BAB29: CartesianInterpreter::~CartesianInterpreter() (nfft.h:216)

How could my code cause fftw_plan_awake to make an invalid memory access like this?

EDIT2: as suggested, I'm attaching a larger portion of the original code (just copy-paste):

class fftwizers {
public:
  int N_threads;
  IVector N;
  CVector Work;
  fftw_complex *work;
  fftw_plan forward,backward;
  fftwizers(){}
 fftwizers(IVector& iN,int n_threads=default_threads,unsigned Flags=FFTW_MEASURE):N(iN),N_threads(n_threads)
  {
    cfftwizers(Flags);
  }
 fftwizers(IVector& iN,CVector &input,int n_threads=default_threads,unsigned Flags=FFTW_MEASURE):N(iN),N_threads(n_threads)
  {
    cfftwizers(Flags);
//    input.ReDimension(Work.Dimension(),(complex*) &work[0]);
    input.ReDimension(Work);
  }

 fftwizers(int n,int n_threads=default_threads,unsigned Flags=FFTW_MEASURE):N(1),N_threads(n_threads)
    {
      N(0) = n;
      cfftwizers(Flags);
    }

 void ReDimension(IVector& iN,int n_threads=default_threads,unsigned Flags=FFTW_MEASURE)
  {
    N.ReDimension(iN);
    N_threads=n_threads;
    cfftwizers(Flags);
  }
 void ReDimension(int n,int n_threads=default_threads,unsigned Flags=FFTW_MEASURE)
    {
      N.ReDimension(1);
      N(0) = n;
      N_threads=n_threads;
      cfftwizers(Flags);
    }

  void cfftwizers(unsigned Flags)
  {
    int siz = 1;
    if(!fftw_init_threads()) error("Failed to initialize multitread fftw at nfft.h");
    int max_thread=omp_get_max_threads();
    fftw_plan_with_nthreads(N_threads); // every plan created after this line will use N_threads threads
    if((!slave)&(max_thread<N_threads)) printf("A request to create an fftw with %i threads, the maximum available thread number is %i \n",N_threads,max_thread);
    for (int i=0; i < N.Dimension(); ++i) siz *= N(i);
    //omp_set_num_threads(N_threads);
    //omp_set_dynamic(false);
    //work = fftw_alloc_complex(siz);
    //fftw_complex *work = (fftw_complex*) fftw_malloc(sizeof(fftw_complex)*siz);
    work = (fftw_complex*) fftw_malloc(sizeof(fftw_complex)*siz);
    Work.ReDimension(siz,(complex*) &work[0]);
//    if(!slave) printf("\nBuilding fftwizer with %i as a flag",Flags);
    int sign = FFTW_FORWARD;
    forward = fftw_plan_dft(N.Dimension(),&N(0), work, work,sign,Flags);
    sign = FFTW_BACKWARD;
    backward = fftw_plan_dft(N.Dimension(),&N(0), work, work, sign,Flags);
  }
  ~fftwizers()
    {
      fftw_free(work);
      if (forward)  fftw_destroy_plan(forward);
      if (backward) fftw_destroy_plan(backward);
      fftw_cleanup_threads();
    }
  void go(int sign,CVector& Arr)
  {
    Work = Arr;
    if (sign==FFTW_FORWARD)
      fftw_execute(forward);
    else

... ...

Upvotes: 1

Views: 965

Answers (1)

MrEtArn

Reputation: 325

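Eventually the cause of this bug turned out to be this line in the destructor:

fftw_cleanup_threads();

It was called when the first object of the class was destroyed, and it freed global data used by all FFTW plans (even ones that were constructed elsewhere), so destroying any remaining plan afterwards accessed invalid memory. The solution was to remove this line from the destructor. fftw_cleanup_threads() only releases FFTW's internal resources, so it is needed at most once, at the very end of the program, after every plan has been destroyed.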
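
If the cleanup call should stay inside the wrapper rather than move to the end of the program, one option is to count live wrapper objects and run the global cleanup only when the last one is destroyed. The following is a minimal sketch of that idea, not the code from my program: GlobalStateGuard and Wrapper are hypothetical names, and the cleanup is passed in as a callback so the sketch compiles without FFTW.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical helper, not part of FFTW: counts the live users of a
// library's global state and runs the global cleanup only when the last
// user has gone away.
class GlobalStateGuard {
public:
    explicit GlobalStateGuard(std::function<void()> cleanup)
        : cleanup_(std::move(cleanup)) {}

    void acquire() {
        std::lock_guard<std::mutex> lock(mutex_);
        ++live_;
    }

    // Returns true if this call was the one that ran the cleanup.
    bool release() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (--live_ > 0) return false;
        cleanup_();
        return true;
    }

private:
    std::mutex mutex_;
    int live_ = 0;
    std::function<void()> cleanup_;
};

// Stand-in for a wrapper such as fftwizers; plan handling is elided.
class Wrapper {
public:
    explicit Wrapper(GlobalStateGuard& guard) : guard_(guard) { guard_.acquire(); }
    ~Wrapper() {
        // Real code would call fftw_destroy_plan on its own plans here,
        // before releasing the guard.
        guard_.release();
    }
    Wrapper(const Wrapper&) = delete;
    Wrapper& operator=(const Wrapper&) = delete;

private:
    GlobalStateGuard& guard_;
};
```

With FFTW, the guard would presumably be created once as GlobalStateGuard guard(fftw_cleanup_threads); and shared by all wrapper objects. Calling fftw_cleanup_threads() a single time at the end of main() is simpler and is probably enough for most programs.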

Upvotes: 1
