Reputation: 803
I read this post and realized that loops are faster in Julia. Thus, I decided to change my vectorized code into loops. However, I had to use a few if statements in my loop but my loops slowed down after I added more such if statements.
Consider this excerpt, which I directly copied from the post:
function devectorized()
a = [1.0, 1.0]
b = [2.0, 2.0]
x = [NaN, NaN]
for i in 1:1000000
for index in 1:2
x[index] = a[index] + b[index]
end
end
return
end
function time(N)
timings = Array(Float64, N)
# Force compilation
devectorized()
for itr in 1:N
timings[itr] = @elapsed devectorized()
end
return timings
end
I then added a few if statements to test the speed:
function devectorized2()
a = [1.0, 1.0]
b = [2.0, 2.0]
x = [NaN, NaN]
for i in 1:1000000
for index in 1:2
####repeat this 6 times
if index * i < 20
x[index] = a[index] - b[index]
else
x[index] = a[index] + b[index]
end
####
end
end
return
end
I repeated this block six times:
if index * i < 20
x[index] = a[index] - b[index]
else
x[index] = a[index] + b[index]
end
For the sake of conciseness, I'm not repeating this block in my sample code. After repeating the if statements 6 times, devectorized2()
took 3 times as long.
I have two questions:
Upvotes: 1
Views: 1377
Reputation: 12664
Firstly, I don't think the performance here is very odd, since you're adding a lot of work to your function.
Secondly, you should actually return x
here, otherwise the compiler might decide that you're not using x
, and just skip the whole computation, which would thoroughly confuse the timings.
Thirdly, to answer your question 1: You can implement it like this:
x[index] = a[index] + ifelse(index * i < 20, -1, 1) * b[index]
This can be faster in some cases, but not necessarily in your case, where the branch is very easy to predict. Sometimes you can also get speedups by using Bool
s, for example like this:
x[index] = a[index] + (2*(index * i >= 20)-1) * b[index]
Again, in your example this doesn't help much, but there are cases when this approach can give you a decent speedup.
BTW: It isn't necessarily always true that loops are preferable to vectorized code any longer. The post you linked to is quite old. Take a look at this blog post, which shows how vectorized code can achieve similar performance to loopy code. In many cases, though, a loop is the clearest, easiest and fastest way to accomplish your goal.
Upvotes: 2