Performance of Lazy in F#

Question

Why is the creation of Lazy type so slow?

Assume the following code:

type T() =
  let v = lazy (0.0)
  member o.a = v.Value

type T2() =
  member o.a = 0.0

#time "on"

for i in 0 .. 10000000 do
  T() |> ignore

#time "on"

for i in 0 .. 10000000 do
  T2() |> ignore

The first loop gives me: Real: 00:00:00.647 whereas the second loop gives me Real: 00:00:00.051. Lazy is 13X slower!!

I have tried to optimize my code in this way and I ended up with simulation code 6X slower. It was then fun to track back where the slow down occurred...

John Palmer · Accepted Answer

The Lazy version has some significant overhead code -

 60     .method public specialname
 61            instance default float64 get_a ()  cil managed
 62     {
 63         // Method begins at RVA 0x2078
 64     // Code size 14 (0xe)
 65     .maxstack 3
 66     IL_0000:  ldarg.0
 67     IL_0001:  ldfld class [FSharp.Core]System.Lazy`1 Test/T::v
 68     IL_0006:  tail.
 69     IL_0008:  call instance !0 class [FSharp.Core]System.Lazy`1::get_Value()
 70     IL_000d:  ret
 71     } // end of method T::get_a

Compare this to the direct version

 .method public specialname
130            instance default float64 get_a ()  cil managed
131     {
132         // Method begins at RVA 0x20cc
133     // Code size 10 (0xa)
134     .maxstack 3
135     IL_0000:  ldc.r8 0.
136     IL_0009:  ret
137     } // end of method T2::get_a

So the direct version has a load and then return, whilst the indirect version has a load then a call and then a return.

Since the lazy version has an extra call I would expect it to be significantly slower.

UPDATE: So I wondered if we could create a custom version of lazy which did not require the method calls - I also updated the test to actual call the method rather than just create the objects. Here is the code:

type T() =
  let v = lazy (0.0)
  member o.a() = v.Value

type T2() =
  member o.a() = 0.0

type T3() = 
  let mutable calculated = true
  let mutable value = 0.0
  member o.a() = if calculated then value else failwith "not done";;

#time "on"
let lazy_ = 
  for i in 0 .. 1000000 do
    T().a() |> ignore
  printfn "lazy"
#time "on"
let fakelazy = 
  for i in 0 .. 1000000 do
    T3().a() |> ignore
  printfn "fake lazy"

#time "on"
let direct = 
  for i in 0 .. 1000000 do
    T2().a() |> ignore
  printfn "direct";;

Which gives the following result:

lazy
Real: 00:00:03.786, CPU: 00:00:06.443, GC gen0: 7

val lazy_ : unit = ()


--> Timing now on

fake lazy
Real: 00:00:01.627, CPU: 00:00:02.858, GC gen0: 2

val fakelazy : unit = ()


--> Timing now on

direct
Real: 00:00:01.759, CPU: 00:00:02.935, GC gen0: 2

val direct : unit = ()

Here the lazy version is only 2x slower than the direct version and the fake lazy version is even slightly faster than the direct version - this is probably due to a GC happening during the benchmark.

Performance of Lazy in F#

Answers (2)

Related Questions