Fridooo

Reputation: 21

Multi-threaded FFTW 3.1.2 on a shared-memory computer

I use FFTW 3.1.2 with Fortran to perform real-to-complex and complex-to-real FFTs. It works perfectly on one thread.

Unfortunately, I run into problems when I use the multi-threaded FFTW on a 32-CPU shared-memory computer. I have two plans, one for 9 real-to-complex FFTs and one for 9 complex-to-real FFTs (size of each real field: 512*512). I compile my code with ifort, linking against the following libraries:

-lfftw3f_threads -lfftw3f -lm -lguide -lpthread -mp

The program seems to compile correctly and the function sfftw_init_threads returns a non-zero integer value, usually 65527.

However, even though the program runs correctly, it is slower with 2 or more threads than with one. The top command shows a strange CPU load above 100% (and much larger than n_threads*100). htop shows that one processor (say number 1) is working at 100% load on the program, while ALL the other processors, including number 1, are listed as working on this very same program at 0% load, 0% memory and 0 TIME.

If anybody has any idea of what's going on here... thanks a lot!

Upvotes: 2

Views: 1655

Answers (2)

xscott

Reputation: 2420

Unless your FFTs are pretty large, the automatic multithreading in FFTW is unlikely to be a win speed-wise. The synchronization overhead inside the library can dominate the computation being done. You should profile different sizes and thread counts and see where the break-even point is.
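One way to look for that break-even point is a small timing loop like the sketch below. It is only an illustration, not the asker's code: the program name, the thread counts (1, 2, 4, 8), the repetition count and the array names are all made up, and it assumes the single-precision legacy Fortran interface (the sfftw_ routines) with fftw3.f on the include path. It times with system_clock, which is wall-clock time, because CPU time summed over threads would hide any speed-up.

```fortran
program fftw_thread_scan
  implicit none
  include 'fftw3.f'
  ! Hypothetical sizes: one 512x512 real field, 50 repetitions per timing.
  integer, parameter :: n = 512, nreps = 50
  integer, parameter :: counts(4) = (/ 1, 2, 4, 8 /)
  integer :: iret, i, k, t0, t1, rate
  integer*8 :: plan
  real, allocatable :: field(:,:)
  complex, allocatable :: spec(:,:)

  allocate(field(n, n), spec(n/2 + 1, n))

  ! Must come before any other FFTW call; returns zero on failure.
  call sfftw_init_threads(iret)
  if (iret == 0) stop 'sfftw_init_threads failed'

  do k = 1, size(counts)
     ! The thread count applies to plans created after this call.
     call sfftw_plan_with_nthreads(counts(k))
     call sfftw_plan_dft_r2c_2d(plan, n, n, field, spec, FFTW_MEASURE)

     ! FFTW_MEASURE overwrites the arrays, so fill them after planning.
     field = 1.0

     call system_clock(t0, rate)
     do i = 1, nreps
        call sfftw_execute(plan)
     end do
     call system_clock(t1)

     print *, counts(k), ' thread(s): ', &
          real(t1 - t0) / real(rate) / real(nreps), ' s per FFT'

     call sfftw_destroy_plan(plan)
  end do
end program fftw_thread_scan
```

Changing n in the same loop shows how the break-even point moves with the transform size.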

Upvotes: 1

ire_and_curses

Reputation: 70162

This looks like it could be a synchronisation problem. You can get this type of behaviour if all threads except one are locked out, for example by a semaphore guarding a library call.

How are you calling the planner? Are all your function calls correctly synchronised? Are you creating the plans in a single thread or on all threads? I assume you've read the notes on thread safety in the FFTW docs... ;)
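For reference, the FFTW manual states that the planner routines are not thread-safe and that only the execute routines may be called from several threads at once. A call order consistent with that could look like the sketch below. It is not the asker's code: the program name, the thread count of 4 and the array names are assumptions, and it uses the single-precision legacy Fortran interface with fftw3.f on the include path. Everything here happens in the main thread, and FFTW's own threading does the parallel work inside sfftw_execute.

```fortran
program fftw_plan_once
  implicit none
  include 'fftw3.f'
  integer, parameter :: nx = 512, ny = 512
  integer, parameter :: nthreads = 4        ! hypothetical thread count
  integer :: iret
  integer*8 :: plan_r2c, plan_c2r
  real, allocatable :: field(:,:)
  complex, allocatable :: spec(:,:)

  allocate(field(nx, ny), spec(nx/2 + 1, ny))

  ! 1. Initialise threading once, before any other FFTW call.
  call sfftw_init_threads(iret)
  if (iret == 0) stop 'sfftw_init_threads failed'

  ! 2. Set the thread count before creating the plans it should apply to.
  call sfftw_plan_with_nthreads(nthreads)

  ! 3. Create both plans once, from a single thread.
  call sfftw_plan_dft_r2c_2d(plan_r2c, nx, ny, field, spec, FFTW_MEASURE)
  call sfftw_plan_dft_c2r_2d(plan_c2r, nx, ny, spec, field, FFTW_MEASURE)

  ! FFTW_MEASURE overwrites the arrays, so fill them after planning.
  field = 1.0

  ! 4. Execute; FFTW spreads each transform over its own threads.
  call sfftw_execute(plan_r2c)
  call sfftw_execute(plan_c2r)

  ! FFTW transforms are unnormalised: the round trip scales by nx*ny.
  field = field / real(nx * ny)
  print *, 'max round-trip error: ', maxval(abs(field - 1.0))

  call sfftw_destroy_plan(plan_r2c)
  call sfftw_destroy_plan(plan_c2r)
end program fftw_plan_once
```

If the plans are instead created inside an OpenMP region or from several pthreads, the planner calls would need to be serialised by the caller.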

Upvotes: 2
