Charles
Charles

Reputation: 987

Loss of precision in fortran fft

I'm having an issue with computing the fft of some data in Fortran. I don't know if there's something wrong with the algorithm, roundoff, lack of precision or what.

Here is the code

  module fft_mod
  public :: fft1D
  integer,parameter :: cp = selected_real_kind(14)
  real(cp),parameter :: PI = real(3.14159265358979,cp)
  contains
  subroutine fft1D(x)
    complex(cp), dimension(:), intent(inout)  :: x
    complex(cp), dimension(:), allocatable  :: temp
    complex(cp)                               :: S
    integer                                   :: N
    complex(cp)                               :: j ! sqrt(-1)
    integer                                   :: i,k
    N=size(x)
    allocate(temp(N))
    j = cmplx(0.0_cp,1.0_cp,cp)
    S = cmplx(0.0_cp,0.0_cp,cp)
    temp = cmplx(0.0_cp,0.0_cp,cp)
    do i = 1,N
      do k = 1,N
        S = S + x(k)*exp(real(-2.0,cp)*PI*j*real(k-1,cp)*real(i-1,cp)/real(N,cp))
      enddo
      temp(i) = S
      S = cmplx(0.0_cp,0.0_cp,cp)
    enddo
    x = temp
    deallocate(temp)
  end subroutine
  end module
  program test
    use fft_mod
    implicit none
    complex(cp), dimension(10) :: data = (/1.0, 2.0, 3.0, 4.0, 5.0, 5.0, 4.0, 3.0, 2.0, 1.0/)
    integer :: i
    call fft1D(data)
    do i=1,10
       write(*,*) data(i)
    end do
  end program test

Running this in fortran gives

C:\Users\Charlie\Desktop\fft>gmake
gfortran -J".\\mod" -fimplicit-none -Wuninitialized -g -Wall -Wextra -fbacktrace
 -fcheck=all -O0 -fopenmp -D_QUAD_PRECISION_ -cpp -c -o .\\obj\testFFT.o testFFT
.f90
gfortran -o .\\test -J".\\mod" -fimplicit-none -Wuninitialized -g -Wall -Wextra
-fbacktrace -fcheck=all -O0 -fopenmp -D_QUAD_PRECISION_ -cpp .\\obj\testFFT.o

C:\Users\Charlie\Desktop\fft>test.exe
 (  30.000000000000000     ,  0.0000000000000000     )
 ( -9.4721355260035178     , -3.0776825738331275     )
 (  1.2032715918097736E-007,  8.7422769579070803E-008)
 (-0.52786408204828272     ,-0.72654221835813126     )
 (  5.6810824045072650E-008,  1.7484555003832725E-007)
 (  1.0325074129013956E-013,  2.6226834001115759E-007)
 ( -8.5216018574918451E-008,  2.6226836247200680E-007)
 (-0.52786395855490920     , 0.72654325051559143     )
 ( -4.8130813040669906E-007,  3.4969128892559098E-007)
 ( -9.4721398159606647     ,  3.0776922072585111     )

But running the same dataset in matlab gives

format long ; g = [1:5 5:-1:1]; fft(g)'

ans =

 30.000000000000000                     
 -9.472135954999580 + 3.077683537175253i
                  0                     
 -0.527864045000420 + 0.726542528005361i
                  0                     
                  0                     
                  0                     
 -0.527864045000420 - 0.726542528005361i
                  0                     
 -9.472135954999580 - 3.077683537175253i

I believe I'm using double precision using the selected_real_kind(14), but it looks like the result is only single precision at best. I'm sure that some of the sprinkled real(,cp)'s are not necessary, but I don't know where, how or why it looks like the result is single precision compared with matlab.

Any help is greatly appreciated!

UPDATE:

Using advice from the accepted answer, the only thing needed to change here was:

  real(cp),parameter :: PI = real(3.14159265358979,cp)

to

  real(cp),parameter :: PI = 3.14159265358979_cp

Upvotes: 1

Views: 1002

Answers (2)

DrOli
DrOli

Reputation: 1083

The accepted answer above is a reasonable solution if you need only something "near" double precision (Real(8)), since you have defined Pi explicitly to a number of digits close to a Real(8). The actual value of Pi to full Real(8) is

3.1415926535897931

c.f. your Parameter of

3.14159265358979

If you wish to have a more general "full precision consistent with _cp" result, you may wish to use something like

Real(cP), Parameter         :: Pi = 4_cp*ATan(1_cp)

***Edit: thank you francescalus for spotting the typos, this should be ATan(1._cp), though really, as explained below, "Parameter'ized" numbers should be used and avoid "_cp" all together in args etc.

Now, if _cp is, say 16, your Pi will be, automatically:

3.14159265358979323846264338327950280

Now, whenever you change _cp, your code will have the "full precision consistent with _cp" automatically.

BTW, you may also wish "Parameter'ize" certain basic numbers, eg.:

Real(cp), Parameter         :: Zero = 0_cp
Real(cp), Parameter         :: One = 1_cp
:

... etc

***Edit: thank you francescalus for spotting typos, but in these particular expressions, while 0.0_cp and 1.0_cp might be better, it does not matter since they Params declaration takes care of that>

Then, in your code, you could write, for example:

...
Real(cp), Parameter         :: Pi = Four*ATan(One)
...
S = Cmplx(Zero, Zero)*Exp(-Two) ! etc etc

and not have to worry about adding _cp etc. all over the place, and it is much easier to read/debug etc.

This strategy has certain other advantages, particularly in legacy code, such as

If( a == 1.0 ) Then

vs.

If( a == One ) Then

... but that is another story.

Incidentally, and to some extent a question of "style", in Fortran, arithmetic defaults to the "biggest" precision/type in the expression, so

S = S + x(k)*exp(real(-2.0,cp)*PI*j*real(k-1,cp)*real(i-1,cp)/real(N,cp))

should be equivalent to

S = S + x(k)*exp(-2*PI*j*(k-1)*(i-1)/N)

or, even better

S = S + x(k)*exp(-Two*PI*j*(k-1)*(i-1)/N)

Personally, I find the easier it is to read, the easier it is to get it right.

... though as stated above, if your arithmetic contains only Integers, then additional considerations may apply. However, if you "Parameter'ize" only Real's, then you should never have the risks (such as div/0) of using "2" where really a 2.0 is required, if you use "Two".

Finally, as you have "Parameter'ized" Pi, why not also 2*Pi, for example:

...
Real(cp), Parameter         :: TwoPi = Two*Pi

now, the expression

S = S + x(k)*exp(-Two*PI*j*(k-1)*(i-1)/N)

can be

S = S + x(k)*exp(-TwoPI*j*(k-1)*(i-1)/N)

which saves a little CPU. In fact, expanding on this logic could improve the performance of your code further.

... on a completely separate point, did you know your var Temp() can be "Automatic", with, say

complex(cp)                   :: temp(Size(x, Dim = 1))

then there is no need for Allocatable etc . Though from where I sit, it looks like it may be possible to write your code entirely without "S" and "Temp(:)", providing much simpler and faster code.

***** Addendum: Following some of the comments to my answer I thought it may help to show possible changes to the OP code to improve performance, and to some extent further improve precision.

However, before that, the OP did not indicate why a specific degree of precision is required. This can be an important question in many industrial/real world settings. For example, in some cases speed is much more important compared to precision, so long as the precision is sufficiently good to provide context specific reliability/decision making. For instance, it is conceivable that in some cases the calc should be all in Real(4). This matter requires a separate, and non-trivial, discussion.

Be that as it may, while Ross's answer corrected the precision problem in terms of "in memory representation", and my original answer improved precision in terms of "you actually need to provide sufficient sig figs to begin with" (even if the declarations are correct), reducing the number of FLOPS not only improves speed but also can assist with precision, since every FLOP introduces the possibility of truncation/round off, particularly for a long sequence of FLOPs in a single expression.

In addition, this OP issue could be resolved nicely with an "identity" formulation (4*ATan(1)). In many cases, that is either not possible or practical, and then some less elegant methods are required, but I leave those for another day.

The following is just one/two possible alternatives.

I have not tested it.

so it may need a tweaking; please feel free to suggest improvements.

With a bit of luck, it may reduce FLOPs by around 80% or so.

... I would be grateful to the OP to benchmark the different implementations and report results, if convenient. You may also wish to benchmark for _cp = 4, 8, 16, just to demonstrate the trade-off between speed and precision.

This alternative version would require the obvious updating to the calling s/r.

module fft_mod
!
public :: fft1D
!
Integer,parameter :: cp = selected_real_kind(14)        ! just out of curiosity, why "14", see my comments on "reality vs. precision"
!>>real(cp),parameter :: PI = real(3.14159265358979,cp)
!
Real(cp), Parameter     :: Zero = 0_cp            ! these types of declarations can be created in a separate Module, 
                          ! say "MyDeclarationModule", and then just access with "Use"
!
!XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    ! NOTE: With respect to francescalus comments on issues with "1_cp" syntax, the usage here works fine as shown in the previous result,
! though francescalus' comments are fair in suggesting that other approaches make for better coding.
    ! However, and to be clear, I don't actually use "_xx" style myslef.  I have used it here ONLY to be consistent with the
    ! the OP's and earlier answers.  The usage of it here is NOT a recommendation.  A discussion of the alternatives
! is much too long and strays from the immediate issues too far.
! ... Perhaps francescalus will take up the mantle and re-write the OP's code for alternatives
!XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    !
    Real(cp), Parameter     :: One = 1_cp
    Real(cp), Parameter     :: Two = 2_cp
    Real(cp), Parameter     :: Four = 4_cp
!
! ... after this point, there is no need to see "_cp" again, in this example
    !
    Real(cp), Parameter     :: Pi = Four*ATan(One)    ! this guarrantees maximal precision for Pi, up to "_cp"
!
    Real(cp), Parameter     :: TwoPi = Two*Pi     ! Vladimir: now, and only now (that I've talem most everything out of the loop),
                          ! this declaration has less value ... unlike previously, when it was
                          ! still in the inner NxN, and when it saved approx 15 - 20% of FLOPs
                              ! Crucially, if you do a lot of computational mathematics, TwoPi, PiBy2, Root2Pi, etc etc
                          ! arise with considerable frequency, and creating these as Parameters makes for much
                          ! improvement, and I would think it a practice to be encouraged.
    !
    Complex(cp), Parameter         :: SqrtM1 = Cmplx(Zero, One) ! sqrt(-1), this is your "j", 
                                ! sorry but "j" sounds just too much like an "integer" to me 
                                                                ! (for us "old timers"), so I changed the name to something more meaningful to me
    !
    !
Contains
!
Pure Subroutine fft1D(x, n, Temp)       ! OPTIONAL, return the results explicitly to save vector assignment, and preserve original data
                                      ! OPTIONAL, send the size of x() explicitly as n, much cleaner
                                      ! you could leave it as s/r fft1D(x), and most of the savings below would still apply
                                      ! ... also, my practice is to make everything Pure/Elemental where reasonable
    !
Integer, Intent(In)                :: n          ! Optional, but cleaner
!   complex(cp), Intent(InOut)     :: x(n)       ! consider as just Intent(In), and return Temp() as Intent(Out)
complex(cp), Intent(In)            :: x(n)       ! consider as just Intent(In), and return Temp() as Intent(Out)
    !
    ! Locals
    ! ------
    !
!       Complex(cp), Allocatable        :: temp(:)   ! no need for Allocatable
!       Complex(cp)                     :: temp(Size(x), Dim = 1) ! I prefer to pass n explicitly
    Complex(cp), Intent(Out)        :: temp(n)   ! make Intent(Out) to preserve data and save a vector assignment
!
!    complex(cp)                               :: S
!    integer                                   :: N
!    complex(cp)                               :: j ! sqrt(-1)
    !
Integer                             :: i, k
    !
    Complex(cp)                     :: TPSdivN  ! new reduce 4 nxn mults to 1
    !
    Complex(cp)                     :: TPSdivNiM1   ! new, to reduce 5 nxn mults to 1 nx1
    !
    Real(cp)                        :: rKM1(n)      ! new, to obviate nxn k-1's in inner loop
    !
!    N=size(x)                          ! my preference is to pass N explicitly, 
                    ! but can be done this way if prefered (need to make appropriate changes above)
!
!    allocate(temp(N))                  ! Temp() can be either automatic or an Arg, no need for Allocate
!    j = cmplx(0.0_cp,1.0_cp,cp)        ! make Parameter, and rename to SqrtM1 (personal pref)
!    S = cmplx(0.0_cp,0.0_cp,cp)        ! Not required with "improved" inner loop
    !
    !
    temp(1:n) = Cmplx(Zero, Zero)   ! S is not needed, just start with Temp() directly
    !
    TPSdivN = -TwoPi*SqrtM1/n       ! new, move everything out all loops that can be
    !
    ForAll( k=1:n) rKM1(k) = k - 1  ! new, this allows the elimination of NxN "k-1's" in the inner
                ! my preference is for ForAll's, but you could use Do, 
    !
do i = 1,N
    !
    TPSdivNiM1 = TPSdivN*(i-1)      ! new. everything out of inner loop that can be
    !
    ! improved, inner do, but could use "no-Do" alternative as illustrated below
    !
!      do k = 1,N
!>>        S = S + x(k)*exp(real(-2.0,cp)*PI*j*real(k-1,cp)*real(i-1,cp)/real(N,cp))    ! original, many unneccessary nxn FLOPs, type conversions, etc
!        temp(i) = temp(i) + x(k)*Exp(TPSdivNiM1*rKM1(k)))          ! use array of k-1's, then the inner k-loop is no longer required, 
                                                                    ! and can use Product()/Sum() or Dot_Product() intrinsics, see example below
!      End Do
    !
    ! alternative "direct array" approach to inner loop, or "no-DO" version ... there are other possibilities.
    !
    Temp(i) = Dot_Product( x(1:n), Exp(TPSdivNiM1*rKM1(1:n)) )  ! this approach can have certain advantages depending on circumstances
    !
!      temp(i) = S                      ! not needed
!      S = cmplx(0.0_cp,0.0_cp,cp)      ! not needed
    !
End Do
    !
!    x(1:n) = temp(1:n)                 ! not need if Temp() is Intent(Out) that save this vector assignment, and the original data
!    deallocate(temp)                   ! not needed
    !
End Subroutine fft1D
!
!  end module
End Module fft_mod 

Upvotes: 0

Ross
Ross

Reputation: 2179

The problem is how you define real numbers, specifically pi. When you define

real(cp),parameter :: PI = real(3.14159265358979,cp)

you're passing the argument 3.14159265358979 to the function real. But real numbers have default single precision so your real number is cast into single precision as it enters the function. Consider the following example:

  program main
  integer,parameter :: cp = selected_real_kind(14)
  real(cp),parameter :: pi = real(3.14159265358979,cp)
  real(cp),parameter :: pj = 3.14159265358979_cp

  write(*,*) pi
  write(*,*) pj

  end program main

Compiled with pgfortran and no options, this gives me:

3.141592741012573
3.141592653589790

When defining any real number, you should use []_cp to assign kind instead of real([],cp).

Edit: This problem also affects how you define 0.0, 1.0, and 2.0, but those numbers may be cast exactly into binary and do not suffer the same rounding error.

Upvotes: 2

Related Questions