Andrew Willshire
Andrew Willshire

Reputation: 125

F# Recursion vs Iteration speed/overhead

I've been messing around a bit with Riccardo Terrell's Akka.NET Fractal demo (https://github.com/rikace/akkafractal) to try and understand it. (It's great, btw)

One thing I tried as a minor challenge for myself was to rewrite some bits of it in a more functional way. I've got it to work but it's much slower than the original.

Here's (more or less) the original Mandelbrot Set calculation adapted for testing:

    let mandelbrotSet (xp : int) (yp : int) (w : int) (h :int) (width : int) (height : int)
                  (maxr : float) (minr : float) (maxi : float) (mini : float) : List<int> =

    let mutable zx = 0.
    let mutable zy = 0.
    let mutable cx = 0.
    let mutable cy = 0.
    let mutable xjump = ((maxr - minr) / ( float width))

    let yjump = ((maxi - mini) / (float height))

    let mutable tempzx = 0.
    let loopmax = 1000
    let mutable loopgo = 0

    let outputList: int list = List.empty

    for x = xp to (xp + w) - 1 do
        cx <- (xjump * float x) - abs(minr)

        for y = yp to (yp + h) - 1 do
            zx <- 0.
            zy <- 0.
            cy <- (yjump * float y) - abs(mini)
            loopgo <- 0

            while (zx * zx + zy * zy <= 4. && loopgo < loopmax) do
                loopgo <- loopgo + 1
                tempzx <- zx
                zx <- (zx * zx) - (zy * zy) + cx
                zy <- (2. * tempzx * zy) + cy

            (List.append outputList [loopgo]) |> ignore

    outputList

And here's my version with the recursive mbCalc function doing the work:

let mandelbrotSetRec (xp : int) (yp : int) (w : int) (h :int) (width : int) (height : int)
    (maxr : float) (minr : float) (maxi : float) (mini : float) : List<int> =

    let xjump = ((maxr - minr) / (float width))
    let yjump = ((maxi - mini) / (float height))
    let loopMax = 1000
    let outputList: int list = List.empty

    let rec mbCalc(zx:float, zy:float, cx:float, cy:float, loopCount:int) = 
        match (zx * zx + zy * zy), loopCount with //The square of the magnitude of z
          | a,b when a > 4. || b = loopMax -> loopCount
          | _ -> mbCalc((zx * zx) - (zy * zy) + cx, (2. * zx * zy) + cy, cx, cy, loopCount+1) //iteration is the next value of z^2+c

    [|0..w-1|] //For each x...
        |> Array.map (fun x ->  let cx = (xjump * float (x+xp) - abs(minr))
                                [|0..h-1|] ///and for each y...
                                    |> Array.map (fun y -> let cy = (yjump * float (y+yp) - abs(mini))
                                                           let mbVal = mbCalc(0., 0., cx, cy,0)  //Calculate the number of iterations to convergence (recursively)
                                                           List.append outputList [mbVal]))|>ignore

    outputList

Is this just to be expected, pointlessly loading up an Actor with a load of recursive calls, or am I just doing something very inefficiently? Any pointers gratefully received!

If you want to run them then here's a little test script:

let xp = 1500
let yp = 1500
let w = 200
let h = 200
let width = 4000
let height = 4000

let timer1 = new System.Diagnostics.Stopwatch()
timer1.Start()
let ref = mandelbrotSet xp yp w h width height 0.5 -2.5 1.5 -1.5
timer1.Stop()

let timer2 = new System.Diagnostics.Stopwatch()
timer2.Start()
let test = mandelbrotSetRec xp yp w h width height 0.5 -2.5 1.5 -1.5
timer2.Stop

timer1.ElapsedTicks;;
timer2.ElapsedTicks;;
ref = test;;

EDIT: As per Philip's answer below, I added the list output quickly (too quickly!) to make something that ran in a script without requiring any imports. Here's the code to return the image:


    let mandelbrotSetRec (xp : int) (yp : int) (w : int) (h :int) (width : int) (height : int)
        (maxr : float) (minr : float) (maxi : float) (mini : float) : Image<Rgba32> =

        let img = new Image<Rgba32>(w, h)
        let xjump = ((maxr - minr) / (float width))
        let yjump = ((maxi - mini) / (float height))
        let loopMax = 1000

        //Precalculate the possible colour list
        let palette = List.append  ([0..loopMax - 1] |> List.map(fun c -> Rgba32(byte(c % 32 * 7), byte(c % 128 * 2), byte(c % 16 * 14)))) [Rgba32.Black]

        let rec mbCalc(zx:float, zy:float, cx:float, cy:float, loopCount:int) = 
            match (zx * zx + zy * zy), loopCount with //The square of the magnitude of z
              | a,b when a > 4. || b = loopMax -> loopCount
              | _ -> mbCalc((zx * zx) - (zy * zy) + cx, (2. * zx * zy) + cy, cx, cy, loopCount+1) //iteration is the next value of z^2+c

        [|0..w-1|] //For each x...
            |> Array.map (fun x ->  let cx = (xjump * float (x+xp) - abs(minr))
                                    [|0..h-1|] ///and for each y...
                                        |> Array.map (fun y -> let cy = (yjump * float (y+yp) - abs(mini))
                                                               let mbVal = mbCalc(0., 0., cx, cy,0)  //Calculate the number of iterations to convergence (recursively)
                                                               img.[x,y] <- palette.[mbVal]))|>ignore
        img

Upvotes: 2

Views: 493

Answers (1)

Phillip Carter
Phillip Carter

Reputation: 5005

Firstly, both functions return [] so there is no mandlebrot set being returned even if it's correctly calculated. List.append returns a list, it doesn't mutate an existing one.

Using a quick BenchmarkDotNet program below, where each function is in its own module:

open BenchmarkDotNet.Attributes
open BenchmarkDotNet.Running
open ActorTest

[<MemoryDiagnoser>]
type Bench() =
    let xp = 1500
    let yp = 1500
    let w = 200
    let h = 200
    let width = 4000
    let height = 4000    

    [<Benchmark(Baseline=true)>]
    member _.Mutable() =
        Mutable.mandelbrotSet xp yp w h width height 0.5 -2.5 1.5 -1.5

    [<Benchmark>]
    member _.Recursive() =
        Recursive.mandelbrotSet xp yp w h width height 0.5 -2.5 1.5 -1.5

[<EntryPoint>]
let main argv =
    let summary = BenchmarkRunner.Run<Bench>()
    printfn "%A" summary
    0 // return an integer exit code

Your code gave these results:

|    Method |     Mean |     Error |    StdDev | Ratio | RatioSD |    Gen 0 |    Gen 1 | Gen 2 | Allocated |
|---------- |---------:|----------:|----------:|------:|--------:|---------:|---------:|------:|----------:|
|   Mutable | 1.356 ms | 0.0187 ms | 0.0166 ms |  1.00 |    0.00 | 406.2500 |        - |     - |   1.22 MB |
| Recursive | 2.558 ms | 0.0303 ms | 0.0283 ms |  1.89 |    0.03 | 613.2813 | 304.6875 |     - |   2.13 MB |

I noticed that you're using Array.map but there's no results being captured anywhere, so changing that to Array.iter got your code to be nearly the same:

|    Method |     Mean |     Error |    StdDev | Ratio |    Gen 0 | Gen 1 | Gen 2 | Allocated |
|---------- |---------:|----------:|----------:|------:|---------:|------:|------:|----------:|
|   Mutable | 1.515 ms | 0.0107 ms | 0.0094 ms |  1.00 | 406.2500 |     - |     - |   1.22 MB |
| Recursive | 1.652 ms | 0.0114 ms | 0.0101 ms |  1.09 | 607.4219 |     - |     - |   1.82 MB |

This difference can probably be explained by the additional allocations done with the mapping calls. Allocations are expensive, especially when it's larger arrays, so it's best to avoid them if possible. Exact timings will differ from machine to machine but I'd expect a similar before/after ratio when using BenchmarkDotNet.

This could probably be further optimized by avoiding the list allocations and pre-allocating a list or array that you fill in. The same is true for the iterative call. Also looping through a Span<'T> will be faster than an array since it elides a bounds check, but you'd probably have to change the shape of your code a lot to do that.

Lastly, always use a statistical benchmarking tool like BenchmarkDotNet to measure performance in microbenchmarks like this. Quick scripts are fine as a starting point, but they're no substitute for a tool that accounts for execution time variability on a machine.

Upvotes: 10

Related Questions