Reputation: 93
I have a function I'm trying to do a flop count on , but I keep getting 2n instead of n^2. I know its supposed to be n^2 based on the fact it's still a nxn triangular system that is just being solved in column first order. I'm new to linear algebra so please forgive me. I'll include the function , as well as all my work shown the best I can. Column BS Function
function col_bs(U, b)
n = length(b)
x = copy(b)
for j = n:-1:2
if U[j,j] == 0
error("Error: Matrix U is singular.")
end
x[j] = x[j]/U[j,j]
for i=1:j-1
x[i] = x[i] - x[j] * U[i , j ]
end
end
x[1] = x[1]/U[1,1]
return x
end
The ð loop does: âð=ðâ112
1 flop for the division ðĨ[ð]/=ð[ð,ð]
Inside the for ð loop in total does: 1+âð=ðâ112
The ð loop itself does: âð=2ð(1+âð=ðâ112))
Then one final flop for ðĨ[1]=ðĨ[1]/ð[1,1].
Finally we have 1+(âð=2ð(1+âð=ðâ112))).
Which we can now break down.
If we distribute and simplify 1+(âð=2ð+âð=2ðâð=ðâ112).
We can look at only the significant variables and ignore constants,
1+(ð+ð(1))(ð+ð(1))ð+ð2ð
Which then means that if we ignore constants the highest possibility of flops for this formula would be ð ( which may be a hint to whats wrong with my function since it should be ð2 just like the rest of our triangular systems I believe) Proof
Upvotes: 4
Views: 217
Reputation: 428
Since the code has two nested for-loops, each one proportional to n
, a quadratic runtime can be expected.
using LinearAlgebra, BenchmarkTools
function col_bs(U, b)
n = length(b)
x = copy(b)
for j = n:-1:2 # n*(
if U[j,j] == 0
error("Error: Matrix U is singular.")
end
x[j] = x[j]/U[j,j] # 1 +
for i=1:j-1 # n*(
x[i] = x[i] - x[j] * U[i , j ] # 2
end # )
end # ) +
x[1] = x[1]/U[1,1] # 1
return x # = n*(1+2*n) + 1 = O(n^2)
end
for n in 2 .^[1:10...]
local U = triu(randn(n,n))
local b = randn(n,1)
@btime col_bs($U, $b)
end
nicely approaches the expected factor-of-4 increase in runtime:
61.366 ns (1 allocation: 80 bytes)
90.900 ns (1 allocation: 96 bytes)
147.557 ns (1 allocation: 128 bytes)
280.000 ns (1 allocation: 192 bytes)
1.100 Ξs (1 allocation: 336 bytes)
3.900 Ξs (1 allocation: 576 bytes)
15.600 Ξs (1 allocation: 1.06 KiB)
56.800 Ξs (1 allocation: 2.12 KiB)
219.200 Ξs (1 allocation: 4.12 KiB)
957.200 Ξs (1 allocation: 8.12 KiB)
Upvotes: 2