GrayR

Reputation: 1385

Fastest way to deserialize an Int Vector in Scala

I have a Vector which I use as a look-up table, keyed by element index. It's really huge (30+ million elements). I want to store it on the file system and read it back into some object every time I start my application. I currently see three options here:

  1. just write a file full of ints
  2. use binary format
  3. serialize object with Vector in it

What is the best approach here?

Upvotes: 1

Views: 86

Answers (1)

ArtemGr

Reputation: 12567

Write a file full of raw 4-byte ints (binary, not text), then use a wrapper around a memory-mapped file to read it.

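import java.nio.MappedByteBuffer
import java.nio.channels.FileChannel
import java.nio.file.{Paths, StandardOpenOption}

class MMappedIntVector (mmap: MappedByteBuffer) {
  // Each Int takes 4 bytes, so element idx starts at byte offset idx * 4.
  def getInt (idx: Int): Int = mmap.getInt (idx * 4)
}
object MMappedIntVector {
  def load (path: String): MMappedIntVector = {
    val channel = FileChannel.open (Paths.get (path), StandardOpenOption.READ)
    // The mapping stays valid after the channel is closed.
    try new MMappedIntVector (channel.map (FileChannel.MapMode.READ_ONLY, 0, channel.size ()))
    finally channel.close ()
  }
}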

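The downside, of course, is that the byte order of the file has to match the byte order of the buffer that reads it. Java buffers are big-endian by default regardless of the CPU, so the file only ends up locked to the endianness of your CPU if you switch the buffer to ByteOrder.nativeOrder().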
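
For the writing side, here is a minimal sketch that fixes the byte order explicitly on both ends, so the file does not depend on the machine that produced it. The object name IntVectorFile, its write and read methods, and the chunk size are all made up for illustration; only the java.nio calls are real.

```scala
import java.nio.{ByteBuffer, ByteOrder}
import java.nio.channels.FileChannel
import java.nio.file.{Paths, StandardOpenOption}

// Hypothetical helper, not part of the answer above.
object IntVectorFile {
  // Byte order used on disk; writer and reader must agree on it.
  val Order: ByteOrder = ByteOrder.BIG_ENDIAN

  // Writes the values as raw 4-byte ints, one chunk at a time,
  // so memory use stays bounded even for tens of millions of elements.
  def write (path: String, values: Iterator[Int], chunk: Int = 1 << 16): Unit = {
    val channel = FileChannel.open (Paths.get (path),
      StandardOpenOption.CREATE, StandardOpenOption.WRITE,
      StandardOpenOption.TRUNCATE_EXISTING)
    try {
      val buf = ByteBuffer.allocate (chunk * 4).order (Order)
      while (values.hasNext) {
        buf.clear ()
        while (buf.hasRemaining && values.hasNext) buf.putInt (values.next ())
        buf.flip ()
        while (buf.hasRemaining) channel.write (buf)
      }
    } finally channel.close ()
  }

  // Maps the file read-only and returns an index-based accessor.
  def read (path: String): Int => Int = {
    val channel = FileChannel.open (Paths.get (path), StandardOpenOption.READ)
    try {
      val mmap = channel.map (FileChannel.MapMode.READ_ONLY, 0, channel.size ())
      mmap.order (Order)
      (idx: Int) => mmap.getInt (idx * 4)
    } finally channel.close ()
  }
}
```

Usage would look like IntVectorFile.write ("lookup.bin", myVector.iterator) once, and val get = IntVectorFile.read ("lookup.bin") at startup. Note that a single mapping is limited to 2 GB, which is plenty for 30 million ints (about 120 MB).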

Upvotes: 1
