NPS

Reputation: 6365

Fastest way to read/write an array from/to a file?

I know there are several similar threads here and elsewhere on the net, but I still seem to be doing something wrong. My task is simple: write (and later read) a big array of integers (int[], ArrayList, or whatever you think is best) to a file. The faster the better. My actual array holds about 4.5M integers, and the current times are, for example (in ms):

This is unacceptable and I guess the times should be much lower. What am I doing wrong? I don't need the fastest method on earth but getting these times to about 5 - 15 seconds (less is welcome but not mandatory) is my goal.

My current code:

long start = System.nanoTime();

Node trie = dawg.generateTrie("dict.txt");
long afterGeneratingTrie = System.nanoTime();
ArrayList<Integer> array = dawg.generateArray(trie);
long afterGeneratingArray = System.nanoTime();

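try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream("test.txt")))
{
    // try-with-resources flushes and closes the stream before the timer is read
    out.writeObject(array);
}
catch (Exception e)
{
    Logger.getLogger(DawgTester.class.getName()).log(Level.SEVERE, null, e);
}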
long afterSavingArray = System.nanoTime();

ArrayList<Integer> read = new ArrayList<Integer>();
try (ObjectInputStream in = new ObjectInputStream(new FileInputStream("test.txt")))
{
    read = (ArrayList<Integer>) in.readObject();
}
catch (Exception e)
{
    Logger.getLogger(DawgTester.class.getName()).log(Level.SEVERE, null, e);
}
long afterLoadingArray = System.nanoTime();

System.out.println("Generating trie: " + 0.000001 * (afterGeneratingTrie - start));
System.out.println("Generating array: " + 0.000001 * (afterGeneratingArray - afterGeneratingTrie));
System.out.println("Saving array: " + 0.000001 * (afterSavingArray - afterGeneratingArray));
System.out.println("Loading array: " + 0.000001 * (afterLoadingArray - afterSavingArray));

Upvotes: 0

Views: 2132

Answers (2)

obataku

Reputation: 29656

Something like the following is probably a fairly fast option. You should also use an actual int[] array rather than an ArrayList<Integer> if your concern is reducing overhead.

final Path path = Paths.get("dict.txt");
...
final int[] rsl = dawg.generateArray(trie);
// 4 bytes per int
final ByteBuffer buf = ByteBuffer.allocateDirect(rsl.length << 2);

// The int view shares buf's memory but keeps its own position and limit,
// so buf itself still spans the whole array (position 0, limit = capacity).
final IntBuffer buf_i = buf.asIntBuffer();
buf_i.put(rsl);
try (final WritableByteChannel out = Files.newByteChannel(path,
    StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
  // a single write() call may not drain the buffer
  while (buf.hasRemaining()) {
    out.write(buf);
  }
}

buf.clear();
try (final ReadableByteChannel in = Files.newByteChannel(path,
    StandardOpenOption.READ)) {
  // stop at end of file instead of looping forever on a short file
  while (buf.hasRemaining() && in.read(buf) != -1) {
  }
}
buf_i.clear();
buf_i.get(rsl);
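
For anyone who wants to try the channel approach in isolation, here is a minimal self-contained sketch of the same idea. The class name IntArrayChannels, the write/read helpers and the temp file are made up for illustration. Unlike the snippet above, it creates the file if it is missing and sizes the read buffer from the file length, which assumes the file holds nothing but raw ints and is smaller than 2 GB.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

// Hypothetical helper class, not part of the original answer.
class IntArrayChannels {

    // Dumps the array as raw 4-byte big-endian ints (ByteBuffer's default order).
    static void write(Path file, int[] values) throws IOException {
        ByteBuffer buf = ByteBuffer.allocateDirect(values.length * Integer.BYTES);
        // The int view has its own position, so buf stays at position 0.
        buf.asIntBuffer().put(values);
        try (FileChannel out = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            while (buf.hasRemaining()) {
                out.write(buf);
            }
        }
    }

    // Assumption: the file contains only ints and fits in a single buffer.
    static int[] read(Path file) throws IOException {
        try (FileChannel in = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocateDirect((int) in.size());
            while (buf.hasRemaining() && in.read(buf) != -1) {
                // keep reading until the buffer is full or EOF is hit
            }
            buf.flip();
            int[] values = new int[buf.remaining() / Integer.BYTES];
            buf.asIntBuffer().get(values);
            return values;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("ints", ".bin");
        int[] original = {1, -2, 3, Integer.MAX_VALUE};
        write(file, original);
        System.out.println(Arrays.equals(original, read(file)));
        Files.delete(file);
    }
}
```

Whether this beats a buffered stream for 4.5M ints depends on the JVM and disk, so it is worth timing both.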

Upvotes: 0

jtahlborn

Reputation: 53694

Don't use Java serialization. It is very powerful and robust, but not particularly speedy (or compact). Use a simple DataOutputStream and call writeInt() for each element. (Make sure you put a BufferedOutputStream between the DataOutputStream and the FileOutputStream.)

If you want to pre-size your array on read, write the array length as the first int.
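
A rough sketch of what that could look like, with the length written as the first int. The class name IntArrayStreams, the write/read helpers and the temp file are invented for the example, and it assumes a plain int[] rather than an ArrayList<Integer>.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Arrays;

// Hypothetical helper class, for illustration only.
class IntArrayStreams {

    // Writes the length first, then each element as a 4-byte big-endian int.
    static void write(File file, int[] values) throws IOException {
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(new FileOutputStream(file)))) {
            out.writeInt(values.length);
            for (int v : values) {
                out.writeInt(v);
            }
        }
    }

    // Reads the length prefix, pre-sizes the array, then fills it.
    static int[] read(File file) throws IOException {
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(new FileInputStream(file)))) {
            int[] values = new int[in.readInt()];
            for (int i = 0; i < values.length; i++) {
                values[i] = in.readInt();
            }
            return values;
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("ints", ".bin");
        int[] original = {1, -2, 3, Integer.MAX_VALUE};
        write(file, original);
        System.out.println(Arrays.equals(original, read(file))); // prints "true"
        file.delete();
    }
}
```

The buffered wrappers are what keep this from hitting the file once per writeInt() or readInt() call.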

Upvotes: 3
