Dr Y Wit

Reputation: 2020

Immutability and performance

I searched Stack Overflow and found a number of relevant topics, including:

val-mutable versus var-immutable in Scala

What is the difference between a var and val definition in Scala?

But I still want to clarify if my understanding is correct.

It looks like the rule is the following:

Prefer immutable val over immutable var over mutable val over mutable var. Especially immutable var over mutable val!

But the performance of an immutable var is much worse than that of a mutable val, according to a simple test in the REPL:

scala -J-Xmx2g

scala> import scala.collection.mutable.{Map => MMap}
import scala.collection.mutable.{Map=>MMap}

scala>

scala> def time[R](block: => R): R = {
     |   val t0 = System.nanoTime()
     |   val result = block    // call-by-name
     |   val t1 = System.nanoTime()
     |   println("Elapsed time: " + (t1 - t0) + "ns")
     |   result
     | }
time: [R](block: => R)R

scala> time {val mut_val = MMap[Int, Int](); for (i <- 1 to 1000000) mut_val += (i -> i)}
Elapsed time: 551073900ns

scala> time {var mut_var = MMap[Int, Int](); for (i <- 1 to 1000000) mut_var += (i -> i)}
Elapsed time: 574174400ns

scala> time {var imut_var = Map[Int, Int](); for (i <- 1 to 1000000) imut_var += (i -> i)}
Elapsed time: 860938800ns

scala> time {val mut_val = MMap[Int, Int](); for (i <- 1 to 2000000) mut_val += (i -> i)}
Elapsed time: 1103283000ns

scala> time {var mut_var = MMap[Int, Int](); for (i <- 1 to 2000000) mut_var += (i -> i)}
Elapsed time: 1166532600ns

scala> time {var imut_var = Map[Int, Int](); for (i <- 1 to 2000000) imut_var += (i -> i)}
Elapsed time: 2926954500ns

I also added a mutable var, even though it does not make much sense, just to complete the picture. Its performance is quite similar to that of a mutable val. But the performance of an immutable var is much worse.

So the price of an immutable var compared with a mutable val is degraded performance (and higher memory usage!). Can someone please explain what the point of paying that price is? Specific examples would be appreciated.

Thanks

Upvotes: 0

Views: 741

Answers (1)

kag0

Reputation: 6054

There are a lot of ways to answer this question (and many comments to refute the claim that a mutable val always performs better), but here's one in a phrase: scope and side effects.

Changes to an immutable var are confined to the scope where it is declared: reassigning it only rebinds that one name, so passing the value around as a parameter or assigning it to another variable usually behaves as expected.
A mutable val, on the other hand, is shared by reference, so any state change is visible everywhere the collection has been used or passed.
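As a minimal sketch of that difference (the names `ScopeDemo`, `withDebug` and `enableDebug` are hypothetical, made up for this illustration), compare a function that receives an immutable map with one that receives a mutable map:

```scala
import scala.collection.mutable

object ScopeDemo {
  // Hypothetical helper: builds and returns a new immutable map.
  // The map the caller passed in is left untouched.
  def withDebug(config: Map[String, String]): Map[String, String] =
    config + ("debug" -> "true")

  // Hypothetical helper: updates the map it receives in place,
  // so every holder of the same reference sees the new entry.
  def enableDebug(config: mutable.Map[String, String]): Unit =
    config("debug") = "true"

  def main(args: Array[String]): Unit = {
    var immutableConfig = Map("timeout" -> "5")
    val derived = withDebug(immutableConfig)
    // Reassigning the var rebinds this name only; `derived` is unaffected.
    immutableConfig += ("retries" -> "3")
    println(immutableConfig.contains("debug")) // false
    println(derived.contains("retries"))       // false

    val mutableConfig = mutable.Map("timeout" -> "5")
    enableDebug(mutableConfig)
    println(mutableConfig.contains("debug"))   // true
  }
}
```

With the immutable map, the effect of each change stays with the name it was assigned to. With the mutable map, the callee's change leaks back to the caller.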

Consider an application where you want to run several workers with different configurations that inherit from a default (assume the environment has TIMEOUT_1 = t1, PARALLEL_FACTOR_1 = pf1, TIMEOUT_2 = t2, etc.).

With immutable vars

import scala.collection.immutable

var defaultConfig = immutable.Map("timeout" -> "5", "parallelFactor" -> "4")
// Each assignment copies the reference to the same immutable map
var worker1Config = defaultConfig
var worker2Config = defaultConfig

// += builds a new map and rebinds only the var on the left
worker1Config += "timeout" -> sys.env("TIMEOUT_1")
worker1Config += "parallelFactor" -> sys.env("PARALLEL_FACTOR_1")

worker2Config += "timeout" -> sys.env("TIMEOUT_2")
worker2Config += "parallelFactor" -> sys.env("PARALLEL_FACTOR_2")

println(defaultConfig) // Map(timeout -> 5, parallelFactor -> 4)
println(worker1Config) // Map(timeout -> t1, parallelFactor -> pf1)
println(worker2Config) // Map(timeout -> t2, parallelFactor -> pf2)

With mutable vals

import scala.collection.mutable

val defaultConfig = mutable.Map("timeout" -> "5", "parallelFactor" -> "4")
// All three names refer to the same mutable map
val worker1Config = defaultConfig
val worker2Config = defaultConfig

// += updates that single shared map in place
worker1Config += "timeout" -> sys.env("TIMEOUT_1")
worker1Config += "parallelFactor" -> sys.env("PARALLEL_FACTOR_1")

worker2Config += "timeout" -> sys.env("TIMEOUT_2")
worker2Config += "parallelFactor" -> sys.env("PARALLEL_FACTOR_2")

println(defaultConfig) // Map(parallelFactor -> pf2, timeout -> t2)
println(worker1Config) // Map(parallelFactor -> pf2, timeout -> t2)
println(worker2Config) // Map(parallelFactor -> pf2, timeout -> t2)

You can see that even though almost all the code is the same, using mutable vals has introduced a non-obvious bug: all three names refer to the same map, so the last write wins everywhere. This would be even harder to spot if these sections of code were in different functions rather than all together.
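If a mutable map is still wanted, one possible way around the aliasing is to give each worker its own copy with `clone()`. The sketch below is only an illustration under assumptions: the object name `CloneDemo` and the method `buildConfigs` are hypothetical, and the literal values "t1", "pf1", "t2" and "pf2" stand in for the environment variables so that the snippet runs on its own.

```scala
import scala.collection.mutable

object CloneDemo {
  type Config = mutable.Map[String, String]

  // Hypothetical helper: returns (default, worker1, worker2) configurations.
  def buildConfigs(): (Config, Config, Config) = {
    val defaultConfig = mutable.Map("timeout" -> "5", "parallelFactor" -> "4")
    // clone() makes a separate map, so later updates do not touch the default
    val worker1Config = defaultConfig.clone()
    val worker2Config = defaultConfig.clone()

    // Placeholder values instead of sys.env lookups (assumption for this sketch)
    worker1Config += "timeout" -> "t1"
    worker1Config += "parallelFactor" -> "pf1"

    worker2Config += "timeout" -> "t2"
    worker2Config += "parallelFactor" -> "pf2"

    (defaultConfig, worker1Config, worker2Config)
  }

  def main(args: Array[String]): Unit = {
    val (default, w1, w2) = buildConfigs()
    println(default("timeout")) // 5
    println(w1("timeout"))      // t1
    println(w2("timeout"))      // t2
  }
}
```

The catch is that every place that hands the map on has to remember to copy it, whereas the immutable version gets this isolation without any extra step.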

Upvotes: 1
