Make42

Reputation: 13088

When to use Scala Vector, when Scala Array?

If I need a multidimensional array, I can't use a Vector. But consider the simple one-dimensional case: when should I use a Scala Vector, and when a Scala Array?

Upvotes: 2

Views: 2868

Answers (3)

Abhijit Sarkar

Reputation: 24528

The official Scala 3 collections documentation doesn't show or mention the Array type at all. This looks like an omission; I've created a ticket to get it fixed.

The Array API docs say:

Arrays are mutable, indexed collections of values. Array[T] is Scala's representation for Java's T[].

Vectors, on the other hand, are the immutable indexed collections.

However, there's also the perhaps more popular ArrayBuffer, which does have a place in the official docs. So, if you're looking for mutability, should you use Array or ArrayBuffer? The short answer is, as always, it depends. An ArrayBuffer is resizable; an Array isn't. Arrays are specialized for the built-in value types (except Unit), so Array[Int] is going to be more efficient than ArrayBuffer[Int]: the values won't have to be boxed.
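To make the resizable versus fixed-size distinction concrete, here is a minimal sketch. The object and method names (ArrayVsArrayBuffer, fixedSize, resizable) are made up for illustration and are not part of any library.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical demo object; names are illustrative only.
object ArrayVsArrayBuffer {
  // An Array has a fixed length; its elements can be overwritten in place.
  def fixedSize(): Array[Int] = {
    val arr = Array(1, 2, 3) // backed by a JVM int[], so the Ints are not boxed
    arr(0) = 10              // in-place update; the length stays 3
    arr
  }

  // An ArrayBuffer can grow and shrink after it has been created.
  def resizable(): ArrayBuffer[Int] = {
    val buf = ArrayBuffer(1, 2, 3)
    buf += 4      // append one element, growing the buffer
    buf.remove(0) // drop the first element, shrinking the buffer
    buf
  }

  def main(args: Array[String]): Unit = {
    println(fixedSize().mkString(", "))
    println(resizable().mkString(", "))
  }
}
```

If the number of elements is known up front, the Array version is usually the simpler choice; the ArrayBuffer version is the one to reach for when elements are appended or removed over time.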

See this SO answer for more details on the differences between ArrayBuffer and Array.

Upvotes: 0

Dibbeke

Reputation: 478

When it comes to time and space complexity, arrays are surprisingly versatile. You might expect arrays to be slow for inserts and deletes, until you consider modern memory architectures. CPUs can prefetch and stream arrays straight from memory while performing linear operations on them, such as the copying needed for an insert or delete. Most other data structures require expensive indirections, which defeat cache prefetching.

Immutability

Since linear access to arrays is very fast, I often treat smaller arrays as immutable and copy them on write.
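A minimal sketch of what copy-on-write with a plain array could look like. The helper name updatedCopy is hypothetical, and this is only one way to do it, not necessarily what the answer's author uses.

```scala
// Hypothetical helper object; names are illustrative only.
object CopyOnWrite {
  // Returns a new array with position i set to value.
  // The input array is never modified, so callers can treat it as immutable.
  def updatedCopy(xs: Array[Int], i: Int, value: Int): Array[Int] = {
    val copy = xs.clone() // linear copy of the whole array
    copy(i) = value       // mutate only the fresh copy
    copy
  }

  def main(args: Array[String]): Unit = {
    val original = Array(1, 2, 3)
    val changed = updatedCopy(original, 1, 42)
    println(original.mkString(", "))
    println(changed.mkString(", "))
  }
}
```

Every write costs a full copy, which is why this approach only makes sense for small arrays.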

How to choose

When I consider a data structure for a certain task, I start by analyzing how the task would perform when implemented with a simple array. Only after this first step do I weigh the benefits and penalties of existing abstractions, such as vectors. Possible benefits of other data structures include readability, lower code complexity, performance at scale, opportunities for garbage collection, ease of serialization and cache coherence. Readability and code complexity are at the top of my list, and this often weighs in favor of abstract data structures such as Vectors, Lists, Streams and Maps.

Consider GPU acceleration

When starting with arrays, I always consider the possibility of GPU execution. For example, machine learning relies heavily on vector (not to be confused with Scala's Vector) and matrix operations (linear algebra), which are accelerated on GPU hardware and are often less memory intensive.

Upvotes: 3

stefanobaghino

Reputation: 12804

Choosing a data structure is, as always, a matter of context.

First of all, you have to take into account the problem at hand, the access pattern you expect and the performance characteristics you need. The Scala documentation includes a great comparison. Both collections share the trait of being indexed, allowing fast random access, but you'll notice some differences between the two.

A key difference between the two, as suggested in the comments, is that a Vector is an immutable collection, while Arrays are mutable.
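As a small illustration of that difference, the sketch below updates an Array in place and "updates" a Vector by building a new one. The object and method names are made up for this example.

```scala
// Hypothetical demo object; names are illustrative only.
object MutableVsImmutable {
  // Overwrites index 1 of the given array in place.
  def mutateArray(arr: Array[Int]): Unit =
    arr(1) = 20

  // Vector has no in-place update; updated returns a new Vector
  // and leaves the original unchanged.
  def updateVector(vec: Vector[Int]): Vector[Int] =
    vec.updated(1, 20)

  def main(args: Array[String]): Unit = {
    val arr = Array(1, 2, 3)
    mutateArray(arr)
    println(arr.mkString(", "))

    val vec = Vector(1, 2, 3)
    val vec2 = updateVector(vec)
    println(vec)
    println(vec2)
  }
}
```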

Furthermore, Arrays in Scala map directly onto native Java arrays, which makes it easy to write idiomatic Scala code that can be called from equally idiomatic Java code elsewhere.
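For instance, a Scala Array[Int] can be handed straight to a Java API that expects an int[]. The sketch below uses java.util.Arrays.sort from the Java standard library; the wrapper object and method name are hypothetical.

```scala
import java.util.Arrays

// Hypothetical demo object; names are illustrative only.
object JavaInterop {
  // Array[Int] is a JVM int[], so no conversion is needed
  // before calling java.util.Arrays.sort(int[]).
  def sortedCopy(xs: Array[Int]): Array[Int] = {
    val copy = xs.clone() // keep the caller's array untouched
    Arrays.sort(copy)     // Java sorts the array in place
    copy
  }

  def main(args: Array[String]): Unit =
    println(sortedCopy(Array(3, 1, 2)).mkString(", "))
}
```

A Vector would have to be converted first (for example with toArray) before it could be passed to such an API.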

For further details, both the Array and Vector pages of the official Scala documentation include a good description. You can learn even more in the documentation section dedicated to collections.

Upvotes: 0
