I am implementing the Runge-Kutta-Fehlberg method with adaptive step size (RK45). I define my Butcher tableau in a notebook and load it with
module FehlbergTableau

using StaticArrays

export A, B, CH, CT

# Fehlberg RK4(5) coefficients: nodes A, stage couplings B,
# solution weights CH, and error-estimate weights CT.
const A  = @SVector [ 0, 2/9, 1/3, 3/4, 1, 5/6 ]
const B  = @SMatrix [  0        0         0       0      0
                       2/9      0         0       0      0
                       1/12     1/4       0       0      0
                       69/128  -243/128   135/64  0      0
                      -17/12    27/4     -27/5    16/15  0
                       65/432  -5/16      13/16   4/27   5/144 ]
const CH = @SVector [ 47/450, 0, 12/25, 32/225, 1/30, 6/25 ]
const CT = @SVector [ -1/150, 0, 3/100, -16/75, -1/20, 6/25 ]

end
using .FehlbergTableau
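(As a side note, the tableau can be sanity-checked against the usual consistency conditions for an explicit RK pair: each row of B sums to the corresponding node in A, CH sums to 1, and CT, being the difference of two weight vectors, sums to 0. A minimal sketch of such a check:

@assert all(isapprox(sum(B[i,:]), A[i]; atol=1e-14) for i in 1:6)  # row sums equal the nodes
@assert isapprox(sum(CH), 1.0; atol=1e-14)                         # solution weights sum to 1
@assert isapprox(sum(CT), 0.0; atol=1e-14)                         # error weights sum to 0

)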
If I code the algorithm for RK45 straightforwardly as
using LinearAlgebra   # for norm and I

function infinitesimal_flow(A::SVector{6,Float64}, B::SMatrix{6,5,Float64}, CH::SVector{6,Float64}, CT::SVector{6,Float64}, t0::Float64, Δt::Float64, J∇H::Function, x0::SVector{N,Float64}) where N
    k1 = Δt * J∇H( t0 + Δt*A[1], x0 )
    k2 = Δt * J∇H( t0 + Δt*A[2], x0 + B[2,1]*k1 )
    k3 = Δt * J∇H( t0 + Δt*A[3], x0 + B[3,1]*k1 + B[3,2]*k2 )
    k4 = Δt * J∇H( t0 + Δt*A[4], x0 + B[4,1]*k1 + B[4,2]*k2 + B[4,3]*k3 )
    k5 = Δt * J∇H( t0 + Δt*A[5], x0 + B[5,1]*k1 + B[5,2]*k2 + B[5,3]*k3 + B[5,4]*k4 )
    k6 = Δt * J∇H( t0 + Δt*A[6], x0 + B[6,1]*k1 + B[6,2]*k2 + B[6,3]*k3 + B[6,4]*k4 + B[6,5]*k5 )
    TE = CT[1]*k1 + CT[2]*k2 + CT[3]*k3 + CT[4]*k4 + CT[5]*k5 + CT[6]*k6        # truncation error estimate
    xt = x0 + CH[1]*k1 + CH[2]*k2 + CH[3]*k3 + CH[4]*k4 + CH[5]*k5 + CH[6]*k6   # accepted solution
    norm(TE), xt
end
and compare it with the more compact implementation
function infinitesimal_flow_2(A::SVector{6,Float64}, B::SMatrix{6,5,Float64}, CH::SVector{6,Float64}, CT::SVector{6,Float64}, t0::Float64, Δt::Float64, J∇H::Function, x0::SVector{N,Float64}) where N
    k = MMatrix{N,6}(0.0I)
    TE = zero(x0); xt = x0
    for i = 1:6
        # EDIT: this is wrong! there should be a new variable here, as pointed
        # out by Lutz Lehmann: xs = x0
        for j = 1:i-1
            # xs += B[i,j] * k[:,j]
            x0 += B[i,j] * k[:,j]   # wrong: this overwrites the base point
        end
        k[:,i] = Δt * J∇H(t0 + Δt*A[i], x0)
        TE += CT[i]*k[:,i]
        xt += CH[i]*k[:,i]
    end
    norm(TE), xt
end
Then the first function, which writes out all the stage variables explicitly, is much faster:
using BenchmarkTools

J∇H(t::Float64, X::SVector{N,Float64}) where N = @SVector [ -X[2]^2, X[1] ]
x0 = SVector{2}([0.0, 1.0])

infinitesimal_flow(A, B, CH, CT, 0.0, 1e-2, J∇H, x0)
infinitesimal_flow_2(A, B, CH, CT, 0.0, 1e-2, J∇H, x0)

@btime infinitesimal_flow($A, $B, $CH, $CT, 0.0, 1e-2, $J∇H, $x0)
>> 19.387 ns (0 allocations: 0 bytes)
@btime infinitesimal_flow_2($A, $B, $CH, $CT, 0.0, 1e-2, $J∇H, $x0)
>> 50.985 ns (0 allocations: 0 bytes)
I cannot find a type instability or anything else that would explain the slowdown, and for more complex tableaux I really need the loop form of the algorithm. What am I doing wrong?
P.S.: The bottleneck in infinitesimal_flow_2 is the line k[:,i] = Δt * J∇H(t0 + Δt*A[i], x0).
Each stage of the RK method computes its evaluation point directly from the base point of the RK step. This is explicit in the first method. In the second method you would have to reset the point computation in each stage, such as in
for i = 1:6
    xs = x0                      # fresh evaluation point for this stage, instead of overwriting x0
    for j = 1:i-1
        xs += B[i,j] * k[:,j]
    end
    k[:,i] = Δt * J∇H(t0 + Δt*A[i], xs)
    ...
The slightest error in the step computation can catastrophically throw off the step-size controller, forcing the step size to fall towards zero and thus the effort to increase drastically. An example is the 4101 error in RKF45.
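To make that feedback loop concrete, a minimal adaptive driver on top of infinitesimal_flow could look like the sketch below; the 0.9 safety factor, the 1/5 exponent and the clamping bounds are conventional choices rather than anything prescribed by the question, and integrate is only an illustrative name. A corrupted TE feeds straight into err and hence into every subsequent Δt:

function integrate(J∇H, x0::SVector, t0, t1; Δt = 1e-2, tol = 1e-8)
    t, x = t0, x0
    while t < t1
        Δt = min(Δt, t1 - t)                        # do not step past t1
        err, xt = infinitesimal_flow(A, B, CH, CT, t, Δt, J∇H, x)
        if err <= tol                               # accept the step
            t += Δt
            x  = xt
        end
        # Standard controller update: if err is corrupted (e.g. by overwriting x0
        # inside the stage loop), Δt is driven towards zero and the step count explodes.
        Δt *= clamp(0.9 * (tol / (err + eps()))^(1/5), 0.1, 5.0)
    end
    return x
end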
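As a side note on the timing gap between the two implementations in the question: one way to keep the loop structure while avoiding the mutable stage matrix entirely is to store the stages in a tuple of SVectors (here via Base.setindex, with infinitesimal_flow_3 as a purely illustrative name). Whether this actually matches the speed of the fully written-out version would have to be benchmarked:

function infinitesimal_flow_3(A::SVector{6,Float64}, B::SMatrix{6,5,Float64}, CH::SVector{6,Float64}, CT::SVector{6,Float64}, t0::Float64, Δt::Float64, J∇H::Function, x0::SVector{N,Float64}) where N
    k  = ntuple(_ -> zero(x0), Val(6))   # stages as an immutable tuple of SVectors
    TE = zero(x0); xt = x0
    for i = 1:6
        xs = x0                          # fresh evaluation point for each stage
        for j = 1:i-1
            xs += B[i,j] * k[j]
        end
        k   = Base.setindex(k, Δt * J∇H(t0 + Δt*A[i], xs), i)
        TE += CT[i]*k[i]
        xt += CH[i]*k[i]
    end
    norm(TE), xt
end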