Reputation: 2860
In a Scala program I wrote I have a scala.collection.Map that maps a String to some calculated values (in detail it's Map[String, (Double, immutable.Map[String, Double], Double)]; I know that's ugly and should (and will be) wrapped). Now, if I do this:
stats.map { case (c, (prior, pwc, denom)) => {
    println(c)
    ...
  }
}
it takes about 30 seconds to print roughly 50 values of c! The println is just a test statement; the actual calculation I need was even slower (I aborted after 1 minute of complete silence). However, if I do it like this:
stats.mapValues { case (prior, pwc, denom) => {
    println(prior)
    ...
  }
}
I don't run into these performance issues ... Can anyone explain why this is happening? Am I not following some important Scala guidelines?
Thanks for the help!
Edit:
I investigated the behaviour further. My guess is that the bottleneck comes from accessing the Map data structure. If I do the following, I have the same performance issues:
classes.foreach { c => {
    println(c)
    val ps = stats(c)
  }
}
Here classes is a List[String] that stores the keys of the Map externally. Without the access to stats(c), no performance loss occurs.
Upvotes: 2
Views: 630
Reputation: 16324
mapValues actually returns a view on the original map, which can lead to unexpected performance issues. From this blog post:
...here is a catch: map and mapValues are different in a not-so-subtle way. mapValues, unlike map, returns a view on the original map. This view holds references to both the original map and to the transformation function (here (_ + 1)). Every time the returned map (view) is queried, the original map is first queried and the transformation function is called on the result.
I recommend reading the rest of that post for some more details.
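To make the difference concrete, here is a minimal sketch that counts how often the transformation function runs. The object name MapValuesViewDemo, the sample map and the counters are made up for illustration and are not taken from your code. Note that in Scala 2.13 and later, calling mapValues directly on a Map is deprecated in favour of .view.mapValues, so you may see a deprecation warning there.

```scala
object MapValuesViewDemo {
  // Hypothetical sample data, standing in for the real `stats` map.
  private val original: Map[String, Int] = Map("a" -> 1, "b" -> 2)

  /** Returns (function calls right after mapValues, calls after three lookups). */
  def viewCalls(): (Int, Int) = {
    var calls = 0
    // mapValues only wraps the original map and the function; nothing runs yet.
    val lazyView = original.mapValues { v => calls += 1; v + 1 }
    val afterBuild = calls
    // Every lookup goes to the original map and re-applies the function.
    lazyView("a")
    lazyView("a")
    lazyView("a")
    (afterBuild, calls)
  }

  /** Same measurement for map, which builds a new strict map up front. */
  def strictCalls(): (Int, Int) = {
    var calls = 0
    // map applies the function once per entry while building the result.
    val strict = original.map { case (k, v) => calls += 1; (k, v + 1) }
    val afterBuild = calls
    // Lookups on the strict map only read stored values.
    strict("a")
    strict("a")
    strict("a")
    (afterBuild, calls)
  }

  def main(args: Array[String]): Unit = {
    println(s"mapValues: ${viewCalls()}")
    println(s"map:       ${strictCalls()}")
  }
}
```

With mapValues the function is not called when the view is created, but it runs again on every lookup. With map it runs once per entry and never again. If stats was itself produced by an expensive mapValues somewhere earlier, each stats(c) would repeat that work, which would fit the slowdown you describe, though that is a guess without seeing how stats is built.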
Upvotes: 3