Reputation: 48
I'm using the following algorithm to solve a cubic polynomial equation (x^3 + ax^2 + bx + c = 0):
function find_roots(a, b, c, lower_bound, upper_bound)
implicit none
real*8, intent(in) :: a, b, c, lower_bound, upper_bound
real*8 :: find_roots
real*8 :: Q, R, theta, x, Au, Bu
integer :: i, iter
Q = (a**2 - 3.D0*b)/9.D0
R = (2.D0*a**3 - 9.D0*a*b + 27.D0*c)/54.D0
!If roots are all real, get root in range
if (R**2.lt.Q**3) then
iter = 0
theta = acos(R/sqrt(Q**3))
!print *, "theta = ", theta
do i=-1,1
iter = iter+1
x = -2.D0*sqrt(Q)*cos((theta + dble(i)*PI*2.D0)/3.D0)-a/3.D0
!print *, "iter = ", iter, "root = ", x
if ((x.ge.lower_bound).and.(x.le.upper_bound)) then
find_roots = x
return
end if
end do
!Otherwise, two imaginary roots and one real root, return real root
else
Au = -sign(1.D0, R)*(abs(R)+sqrt(R**2-Q**3))**(1.D0/3.D0)
if (Au.eq.0.D0) then
Bu = 0.D0
else
Bu = Q/Au
end if
find_roots = (Au+Bu)-a/3.D0
return
end if
end function find_roots
Now it turns out that it can be shown analytically that a cubic equation with the following inputs:
Q0 = 1.D0
alpha = 1.D-2
dt = 0.00001D0
Y = 1000000.D0
find_roots(-(2.D0*Q0+Y), &
-(alpha-Q0**2-2.D0*Y*Q0+dt/2.D0*alpha), &
(dt/2.D0*alpha*Q0+Y*alpha-Y*Q0**2), &
Q0-sqrt(alpha), &
Q0+sqrt(alpha)))
MUST have a root between Q0+sqrt(alpha) and Q0-sqrt(alpha). This is a mathematical certainty. However, the function as called above will return 0, not the correct root, due to floating-point error, since the required result is very close to Q0+sqrt(alpha). I've confirmed this by creating a new function which uses quadruple precision. Unfortunately, I can't just always use quadruple precision since this function will be called billions of times and is a performance bottleneck.
So my question is, are there any general ways I could re-write this code to reduce these precision errors, while also maintaining the performance? I tried using the algorithm suggested by wikipedia, but the problem actually got worse.
Upvotes: 1
Views: 397
Reputation: 203
There is something wrong with the accuracy of your calculations, either the calculation of a,b,c or the find_roots function estimates.
I used the a,b,c that are calculated and found that your lower_bound and upper_bound were better estimates of the roots.
I then modified the bounds to be +/- sqrt(alpha)*1.1 so that the range test would work for 64-bit. I also simplified constants that promote exactly to double.
Finally I compared your estimate of the root to the fn (0.9d0) and fn (1.1d0), which shows the find_roots function does not work for the a,b,c provided.
You should check your references for the error or it may just be the approach fails when acos (+/- 1.0 ) is used.
The program I used to test this with lots of prints is:
real*8 function find_roots (a, b, c, lower_bound, upper_bound)
implicit none
real*8, intent(in) :: a, b, c, lower_bound, upper_bound
real*8 :: Q, R, theta, x, Au, Bu, thi
integer :: i, iter
real*8 :: two_pi ! = 8 * atan (1.0d0)
Q = (a**2 - 3.*b)/9.
R = (2.*a**3 - 9.*a*b + 27.*c)/54.
two_pi = 8 * atan (1.0d0)
!If roots are all real, get root in range
if (R**2 < Q**3) then
iter = 0
x = R/sqrt(Q**3)
theta = acos(x)
print *, "theta = ", theta, x
do i=-1,1
iter = iter+1
!! x = -2.D0*sqrt(Q)* cos((theta + dble(i)*PI*2.D0)/3.D0) - a/3.D0
thi = (theta + i*two_pi)/3.
x = -2.*sqrt(Q) * cos (thi) - a/3.
!print *, "iter = ", iter, "root = ", x
if ( (x >= lower_bound) .and. (x <= upper_bound) ) then
find_roots = x
print *, "find_roots = ", x
! return
end if
end do
!Otherwise, two imaginary roots and one real root, return real root
else
Au = -sign(1.D0, R)*(abs(R)+sqrt(R**2-Q**3))**(1.D0/3.D0)
if (Au.eq.0.D0) then
Bu = 0.D0
else
Bu = Q/Au
end if
find_roots = (Au+Bu)-a/3.D0
return
end if
end function find_roots
real*8 function get_cubic (x, a, b, c)
implicit none
real*8, intent(in) :: x, a, b, c
get_cubic = ( ( x + a) * x + b ) * x + c
end function get_cubic
! Now it turns out that it can be shown analytically that a cubic equation with the following inputs:
real*8 Q0, alpha, dt, Y, a, b, c, lower_bound, upper_bound, val, fn
real*8, external :: find_roots, get_cubic
!
Q0 = 1.D0
alpha = 1.0D-2
dt = 0.00001D0
Y = 1000000.0D0
!
a = -(2.*Q0 + Y)
b = -(alpha - Q0**2 - 2.*Y*Q0 + dt/2.*alpha)
c = (dt/2.*alpha*Q0 + Y*alpha - Y*Q0**2)
write (*,*) a,b,c
!
lower_bound = Q0-sqrt(alpha)*1.1
upper_bound = Q0+sqrt(alpha)*1.1
write (*,*) lower_bound, upper_bound
!
val = find_roots (a, b, c, lower_bound, upper_bound)
!
fn = get_cubic ( val, a,b,c )
write (*,*) val, fn
!
! Test the better root values
val = 0.9d0
fn = get_cubic ( val, a,b,c )
write (*,*) val, fn
!
val = 1.1d0
fn = get_cubic ( val, a,b,c )
write (*,*) val, fn
end
Upvotes: 0
Reputation: 620
https://www.cliffsnotes.com/study-guides/algebra/algebra-ii/factoring-polynomials/sum-or-difference-of-cubes This should reduce rounding error. Likewise, you should be able to find a much better grouping of terms, where you don't make the compiler guess what you want, https://en.wikipedia.org/wiki/Horner%27s_method alpha-Q0**2-2.D0*Y*Q0+dt/2.D0*alpha /= (alpha+alpha*.5*dt)-Q0*(Q0+2*Y) You might argue that any good optimizer should know what to do with .5dt vs. dt/2. ifort considers that a part of -no-prec-div even though it can't change roundoff. It's up to you whether you choose single precision constants for readability after checking to make sure that the promotion rules cause them to promote exactly to double. It seems particularly bad style to depend on f77 D0 suffix to choose the same data type as the never-standard real*8; no doubt it does if your compiler doesn't complain.
Upvotes: 1