amirkr

Reputation: 173

Hard drive benchmarking with Java, getting unreasonably fast results

I wrote some code to benchmark the hard drive. It is fairly simple: write a large chunk of bytes (2500 * 10M) to disk with a BufferedOutputStream, then read it back with a BufferedInputStream. I write 2500 bytes at a time, 10M times, to simulate the conditions in another program I wrote. There is also a global variable, "meaningless", that is computed from the bytes that are read. It has absolutely no meaning and exists only to force the execution to actually read and use the bytes (to avoid situations where the reads are skipped due to some optimization).

The code is run 4 times and outputs the results.

Here it is:

import java.io.*;

public class DriveTest
{
    public static long meaningless = 0;

    public static String path = "C:\\test";

    public static int chunkSize = 2500;

    public static int iterations = 10000000;

    public static void main(String[] args)
    {
        try
        {
            for(int i = 0; i < 4; i++)
            {
                System.out.println("Test " + (i + 1) + ":");
                System.out.println("==================================");

                write();
                read();

                new File(path).delete();

                System.out.println("==================================");
            }
        }
        catch(Exception e)
        {
            e.printStackTrace();
        }
    }

    private static void write() throws Exception
    {
        BufferedOutputStream bos = new BufferedOutputStream(
                                   new FileOutputStream(new File(path)));

        long t1 = System.nanoTime();

        for(int i = 0; i < iterations; i++)
        {
            byte[] data = new byte[chunkSize];

            for(int j = 0; j < data.length; j++)
            {
                data[j] = (byte)(j % 127);
            }

            bos.write(data);
        }

        bos.close();

        long t2 = System.nanoTime();

        double seconds = ((double)(t2 - t1) / 1000000000.0);

        System.out.println("Writing took " + (t2 - t1) + 
                           " ns (" + seconds + " seconds).");

        System.out.println("Write rate " + (((double)chunkSize * 
                           iterations / seconds) / 
                           (1024.0 * 1024.0)) + " MB/s.");
    }

    private static void read() throws Exception
    {
        BufferedInputStream bis = new BufferedInputStream(
                                  new FileInputStream(new File(path)));

        long t1 = System.nanoTime();

        byte[] data;

        for(int i = 0; i < iterations; i++)
        {
            data = new byte[chunkSize];

            bis.read(data);

            meaningless += data[i % chunkSize];
        }

        bis.close();

        long t2 = System.nanoTime();

        System.out.println("meaningless is: " + meaningless + ".");

        double seconds = ((double)(t2 - t1) / 1000000000.0);

        System.out.println("Reading Took " + (t2 - t1) + 
                           " ns, which is " + 
                           seconds + " seconds.");

        System.out.println("Read rate " + (((double)chunkSize * 
                           iterations / seconds) / 
                           (1024.0 * 1024.0)) + " MB/s.");
    }
}

The problem here is twofold:

  1. When iterations = 10M (writing ~23GB to disk), the regular 7200 RPM drive gives very fast results that are higher than its specs:


Test 1:
Writing took 148738975163 ns (148.738975163 seconds).
Write rate 160.29327810029918 MB/s.
meaningless is: 1246080000.
Reading Took 139143051529 ns, which is 139.143051529 seconds.
Read rate 171.34781541848795 MB/s.

Test 2:
Writing took 146591885655 ns (146.591885655 seconds).
Write rate 162.64104799270686 MB/s.
meaningless is: 1869120000.
Reading Took 139845492688 ns, which is 139.845492688 seconds.
Read rate 170.48713871206587 MB/s.

Test 3:
Writing took 152049678671 ns (152.049678671 seconds).
Write rate 156.8030798785472 MB/s.
meaningless is: 2492160000.
Reading Took 140152776858 ns, which is 140.152776858 seconds.
Read rate 170.11334662539255 MB/s.

Test 4:
Writing took 151363950081 ns (151.363950081 seconds).
Write rate 157.51344951950355 MB/s.
meaningless is: 3115200000.
Reading Took 139176911081 ns, which is 139.176911081 seconds.
Read rate 171.30612919179143 MB/s.

This seems odd - can the disk actually achieve such speeds? I seriously doubt it, given that the drive's benchmarked specs are lower (and my test goes through Java output/input streams, which, to my novice understanding, should not even be optimal): http://hdd.userbenchmark.com/Toshiba-DT01ACA200-2TB/Rating/2736

  2. When iterations is set to 1M (1000000), the numbers go completely crazy:


Test 1:
Writing took 6918084976 ns (6.918084976 seconds).
Write rate 344.6308912490619 MB/s.
meaningless is: 62304000.
Reading Took 2060226375 ns, which is 2.060226375 seconds.
Read rate 1157.244572706543 MB/s.

Test 2:
Writing took 6970893036 ns (6.970893036 seconds).
Write rate 342.0201369756931 MB/s.
meaningless is: 124608000.
Reading Took 2013661185 ns, which is 2.013661185 seconds.
Read rate 1184.0054368508995 MB/s.

Test 3:
Writing took 7140592101 ns (7.140592101 seconds).
Write rate 333.89188981705496 MB/s.
meaningless is: 186912000.
Reading Took 2011346987 ns, which is 2.011346987 seconds.
Read rate 1185.367719456367 MB/s.

Test 4:
Writing took 7140064035 ns (7.140064035 seconds).
Write rate 333.91658384694375 MB/s.
meaningless is: 249216000.
Reading Took 2041787713 ns, which is 2.041787713 seconds.
Read rate 1167.6952387535623 MB/s.

What kind of caching sorcery is this?? (And what kind of caching can make the writes faster, anyway??) :) And how can it be undone? I have read and written files of 2.3GB! That would be some huge cache, if caching is indeed the problem.

Thanks

Upvotes: 2

Views: 722

Answers (1)

brain

Reputation: 5546

Your test is probably just reading from and writing to the OS page cache. The small data set fits into it completely; the larger one does not, but the writes are still flushed asynchronously by the OS, so the timings measure memory speed rather than disk speed. You should try opening the file with the StandardOpenOption.DSYNC or SYNC options.
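A minimal sketch of what that looks like, assuming NIO.2 (Java 7+); the file name and chunk size are just placeholders (I use a temp file here rather than the question's "C:\\test"):

```java
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SyncWriteSketch
{
    public static void main(String[] args) throws Exception
    {
        // Placeholder file; the question writes to "C:\\test" instead.
        Path path = Files.createTempFile("drivetest", ".bin");

        // DSYNC makes each write() wait until the data has actually
        // reached the device, so the OS page cache can no longer
        // absorb the writes and inflate the measured throughput.
        try (OutputStream os = Files.newOutputStream(path,
                 StandardOpenOption.WRITE,
                 StandardOpenOption.DSYNC))
        {
            byte[] chunk = new byte[2500];
            os.write(chunk);
        }

        System.out.println("wrote " + Files.size(path) + " bytes");
        Files.delete(path);
    }
}
```

Expect the per-write cost to go up dramatically with DSYNC enabled - that slower number is the one that reflects the physical disk.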

Upvotes: 3
