matanox

Reputation: 13686

Scala collections, references, and memory efficiency

Is it true that a Scala collection can only be initialized with the values of vals, vars, and literals, and not with references to the vals/vars themselves?

That is, the list b below will be List(4, 3), and there is no way to make the collection reference a instead of holding its value?

val a = 3
val b = List(4, a)

Should I assume the only way to accomplish such referencing is to switch from plain data to objects, since objects are mostly held by reference by default? Using objects may not be very efficient in computation and memory, where performance matters.

And regarding the pure performance aspects, say a were some large collection and not just a number, would Scala duplicate its "contents" in memory when the above initialization of b took place?

Thanks!

Upvotes: 1

Views: 854

Answers (2)

matanox

Reputation: 13686

Thanks for the great answer, @Gabriele Petronella. This wouldn't format as a comment, so I am adding it here as a supplement.

I think that further to the point of references, the following can also be quite useful, although eq is a bit harder to plug into here.

scala> import scala.collection.mutable.MutableList
import scala.collection.mutable.MutableList

scala> val a = MutableList(1,2)
a: scala.collection.mutable.MutableList[Int] = MutableList(1, 2)

scala> val b = List(a, 3, 4)
b: List[Any] = List(MutableList(1, 2), 3, 4)

scala> a += 100
res20: a.type = MutableList(1, 2, 100)

scala> b
res21: List[Any] = List(MutableList(1, 2, 100), 3, 4)

I'm not sure how to phrase the relationship to referential transparency, but it does relate to the original question.
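As for plugging eq in here: a minimal sketch, assuming ListBuffer as a stand-in for MutableList, with the names SharedMutableElement and demo made up for illustration. Since b is a List[Any], its head has to be viewed as an AnyRef before eq can be applied.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical names, for illustration only.
object SharedMutableElement {
  // Returns (is the head of b the very same object as a?, the list b after mutating a).
  def demo(): (Boolean, List[Any]) = {
    val a = ListBuffer(1, 2)
    val b: List[Any] = List(a, 3, 4) // b holds a reference to the buffer, not a copy
    a += 100                         // mutate the buffer through a

    // eq is defined on AnyRef, so the element typed as Any is cast first.
    val sameObject = b.head.asInstanceOf[AnyRef] eq a
    (sameObject, b)
  }

  def main(args: Array[String]): Unit = {
    val (sameObject, b) = demo()
    println(s"same object: $sameObject")
    println(s"b = $b")
  }
}
```

The mutation is visible through b precisely because both names lead to the same buffer.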

Upvotes: 0

Gabriele Petronella

Reputation: 108091

When you call List(4, a) (which invokes the apply method of the List companion object), its arguments are evaluated immediately (call-by-value), hence the object 3, not the name a, will be stored in the List.

In general, everything in Scala is an object (leaving aside JVM representation details), so what you are storing is an immutable reference to 3, which is immutable in turn.

Also, note that - due to referential transparency (a is a constant reference to 3) - storing a reference to a or to the object it refers to doesn't make a difference, i.e. it's "transparent".
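To make the call-by-value point concrete, here is a minimal sketch that swaps the val for a var, so that the name can be rebound after the list is built. The names ValueSnapshot and demo are made up for illustration.

```scala
// Hypothetical names, for illustration only.
object ValueSnapshot {
  // Returns the list and the final value of the variable a.
  def demo(): (List[Int], Int) = {
    var a = 3
    val b = List(4, a) // a is evaluated here; the list receives the value 3
    a = 5              // rebinding the variable does not touch the list
    (b, a)
  }

  def main(args: Array[String]): Unit = {
    val (b, a) = demo()
    println(s"b = $b, a = $a") // prints "b = List(4, 3), a = 5"
  }
}
```

The list never learns about the name a; it only ever sees what a evaluated to at construction time.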

So, if you instead want an opaque reference to something you can change later on, you can always store a constant reference to a mutable object:

scala> class Foo(var foo: Int)
defined class Foo

scala> val x = new Foo(42)
x: Foo = Foo@3a654e77

scala> val a = List(x)
a: List[Foo] = List(Foo@3a654e77)

scala> a.head.foo
res25: Int = 42

scala> x.foo = 43
x.foo: Int = 43

scala> a.head.foo
res26: Int = 43

but boy that's evil!


As for the performance question, if a is a large immutable collection, referential transparency allows for reusing the existing collection a when constructing b, without pessimistically copying it. Since a cannot mutate, there's no need for cloning at all.

You can easily test this in the REPL:

Let's create an immutable collection a

scala> val a = List(1, 2)
a: List[Int] = List(1, 2)

Let's use a for creating b

scala> val b = List(a, List(3, 4))
b: List[List[Int]] = List(List(1, 2), List(3, 4))

The first element of b is exactly the same a we put in

scala> b.head eq a
res18: Boolean = true

Note that eq compares reference equality, so the above is not just a copy of a. Further proof:

scala> List(1, 2) eq a
res19: Boolean = false
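The same kind of check can be written as a small program. This is a minimal sketch with made-up names (StructuralSharing, demo); it also shows that prepending to an immutable List reuses the existing list as the tail instead of copying it.

```scala
// Hypothetical names, for illustration only.
object StructuralSharing {
  // Returns (does the prepended list share a as its tail?,
  //          do both slots of the nested list hold the same object?).
  def demo(): (Boolean, Boolean) = {
    val a = List(1, 2, 3)
    val c = 0 :: a          // one new cell in front; the rest is a itself
    val nested = List(a, a) // two slots, both referring to the same list
    (c.tail eq a, nested(0) eq nested(1))
  }

  def main(args: Array[String]): Unit = {
    val (sharedTail, sameElement) = demo()
    println(s"c.tail eq a: $sharedTail")
    println(s"nested(0) eq nested(1): $sameElement")
  }
}
```

In both cases only references are stored, so the cost of building the new list does not depend on how large a is.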

Upvotes: 2
