tryingtosolve
tryingtosolve

Reputation: 803

My loops are slow. Is that because of if statements?

I read this post and realized that loops are faster in Julia. Thus, I decided to change my vectorized code into loops. However, I had to use a few if statements in my loop but my loops slowed down after I added more such if statements.

Consider this excerpt, which I directly copied from the post:

function devectorized()
    a = [1.0, 1.0]
    b = [2.0, 2.0]
    x = [NaN, NaN]

    for i in 1:1000000
        for index in 1:2
            x[index] = a[index] + b[index]
        end
    end

    return
end

function time(N)
    timings = Array(Float64, N)

    # Force compilation
    devectorized()

    for itr in 1:N
        timings[itr] = @elapsed devectorized()
    end

    return timings
end

I then added a few if statements to test the speed:

function devectorized2()
    a = [1.0, 1.0]
    b = [2.0, 2.0]
    x = [NaN, NaN]

    for i in 1:1000000
        for index in 1:2

             ####repeat this 6 times
            if index * i < 20
                x[index] = a[index] - b[index]
            else
                x[index] = a[index] + b[index]
            end
             ####

        end
    end

    return
end

I repeated this block six times:

            if index * i < 20
                x[index] = a[index] - b[index]
            else
                x[index] = a[index] + b[index]
            end

For the sake of conciseness, I'm not repeating this block in my sample code. After repeating the if statements 6 times, devectorized2() took 3 times as long.

I have two questions:

  1. Are there better ways to implement if statements?
  2. Why are if statements so slow? I know that Julia is trying to do loops in a way that matches C. Is Julia providing better "translation" between Julia and C and these if statements just made the translation process more difficult?

Upvotes: 1

Views: 1377

Answers (1)

DNF
DNF

Reputation: 12664

Firstly, I don't think the performance here is very odd, since you're adding a lot of work to your function.

Secondly, you should actually return x here, otherwise the compiler might decide that you're not using x, and just skip the whole computation, which would thoroughly confuse the timings.

Thirdly, to answer your question 1: You can implement it like this:

x[index] = a[index] + ifelse(index * i < 20, -1, 1) * b[index]

This can be faster in some cases, but not necessarily in your case, where the branch is very easy to predict. Sometimes you can also get speedups by using Bools, for example like this:

x[index] = a[index] + (2*(index * i >= 20)-1) * b[index]

Again, in your example this doesn't help much, but there are cases when this approach can give you a decent speedup.

BTW: It isn't necessarily always true that loops are preferable to vectorized code any longer. The post you linked to is quite old. Take a look at this blog post, which shows how vectorized code can achieve similar performance to loopy code. In many cases, though, a loop is the clearest, easiest and fastest way to accomplish your goal.

Upvotes: 2

Related Questions