Reputation: 18614

A proper benchmark?

I wanted to measure how long two different programs need to perform the same task. One program used threads, the other didn't. The task was to count up to 2000000.

Class with threads:

public class Main {
    private int res1 = 0;
    private int res2 = 0;

    public static void main(String[] args) {
        Main m = new Main();

        long startTime = System.nanoTime();
        m.func();
        long endTime = System.nanoTime();

        long duration = endTime - startTime;
        System.out.println("duration: " + duration);
    }

    public void func() {
        Thread t1 = new Thread(new Runnable() {

            @Override
            public void run() {
                for (int i = 0; i < 1000000; i++) {
                    res1++;
                }
            }
        });

        Thread t2 = new Thread(new Runnable() {

            @Override
            public void run() {
                for (int i = 1000000; i < 2000000; i++) {
                    res2++;
                }
            }
        });

        t1.start();
        t2.start();

        System.out.println(res1 + res2);
    }
}

Class without threads:

public class Main {

    private int res = 0;

    public static void main(String[] args) {
        Main m = new Main();

        long startTime = System.nanoTime();
        m.func();
        long endTime = System.nanoTime();

        long duration = endTime - startTime;
        System.out.println("duration: " + duration);

    }

    public void func() {

        for (int i = 0; i < 2000000; i++) {
            res++;
        }
        System.out.println(res);
    }
}

After 10 measurements, the average results (in nanoseconds) were:

With threads:    1952358
Without threads: 7941479

Am I doing it right?
How come it's 4x faster with 2 threads, and not just 2x?

Upvotes: 3

Answers (4)

Peter Lawrey

Reputation: 533880

The main reason the multi-thread version is faster is that you don't wait for the loop to finish. You only wait for the threads to start.

You need to add the following after the start() calls:

    t1.join();
    t2.join();

Once you do this, you will notice that starting the threads takes so long that the threaded version is quite a bit slower. If you make your test 100x longer, the cost of starting the threads is not so important.

The single-threaded example takes longer to be JITted properly. You need to make sure you run the test for at least 2 seconds, repeatedly.

My multi-threaded version is:

public class Main {
    private long res1 = 0;
    public long p0, p1, p2, p3, p4, p5, p6, p7;
    private long res2 = 0;

    public static void main(String[] args) throws InterruptedException {
        Main m = new Main();

        for (int i = 0; i < 10; i++) {
            long startTime = System.nanoTime();
            m.func();
            long endTime = System.nanoTime();

            long duration = endTime - startTime;
            System.out.println("duration: " + duration);
        }
        assert m.p0 + m.p1 + m.p2 + m.p3 + m.p4 + m.p5 + m.p6 + m.p7 == 0;
    }

    public void func() throws InterruptedException {
        Thread t1 = new Thread(new Runnable() {
            @Override
            public void run() {
                for (int i = 0; i < 1000000000; i++) {
                    res1++;
                }
            }
        });

        Thread t2 = new Thread(new Runnable() {
            @Override
            public void run() {
                for (int i = 1000000000; i < 2000000000; i++) {
                    res2++;
                }
            }
        });

        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println(res1 + res2);
    }
}

prints the following for multi-threaded tests.

2000000000
duration: 179014396
4000000000
duration: 148814805
.. deleted ..
18000000000
duration: 61767861
20000000000
duration: 72396259

For the single threaded version I comment out one thread and get

2000000000
duration: 266228421
4000000000
duration: 255203050
... deleted ...
18000000000
duration: 125434383
20000000000
duration: 125230354

As expected, when run long enough, two threads are almost twice as fast as one.

In short:

Multi-threaded code can have smaller delays for the current thread if you don't wait for those operations to complete, e.g. asynchronous logging and messaging.
Single-threaded code can be much faster (and simpler) than multi-threaded code unless you have significant CPU-bound tasks to perform (or you can do concurrent IO).
Running the test repeatedly in the same JVM can give different results.

Upvotes: 2

Jeff

Reputation: 12795

There are a couple of tricks you need to remember when benchmarking in Java.

The first is the same when benchmarking anything: one run may just happen to be slower than another, for no meaningful reason. To avoid this, run multiple times and take an average (and I mean lots of times).

The second may not be unique to Java, but might be surprising: Java VMs can take time to "warm up". If you run your code a hundred times, the compiled code can change according to which code paths are extremely common. To combat this, run the code many times before you start taking stats.

How long it takes to warm up depends on your JVM settings - I can't quite remember off the top of my head.
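A rough sketch of what warming up and averaging could look like. The class name, the `countTo` workload, and the run counts (1000 warm-up runs, 100 measured runs) are all made up for illustration, not tuned values; a real harness such as JMH handles this far more carefully.

```java
import java.util.function.LongSupplier;

public class WarmupBenchmark {
    // Accumulates every result so the work is visibly used (hypothetical helper field).
    public static long sink = 0;

    // Hypothetical workload: count up to n and return the count.
    static long countTo(int n) {
        long res = 0;
        for (int i = 0; i < n; i++) {
            res++;
        }
        return res;
    }

    // Runs the task warmupRuns times without timing it, then returns the
    // mean duration in nanoseconds over measuredRuns timed runs.
    static long averageNanos(LongSupplier task, int warmupRuns, int measuredRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            sink += task.getAsLong();
        }
        long total = 0;
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            sink += task.getAsLong();
            total += System.nanoTime() - start;
        }
        return total / measuredRuns;
    }

    public static void main(String[] args) {
        long avg = averageNanos(() -> countTo(2_000_000), 1_000, 100);
        System.out.println("average duration: " + avg + " ns (checksum " + sink + ")");
    }
}
```

The warm-up runs are thrown away on purpose: they only exist to give the JIT a chance to compile the hot loop before any timing starts.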

This is, of course, quite apart from the problem the other answers have pointed out: you're not actually measuring the threaded program.

EDIT: Another thing to be careful of is the compiler realising that any particular variable/loop/entire program is completely pointless. In these situations it's likely to just completely delete it - you might find that you need to use res1 and res2 or else your loops may be completely removed from the compiled code.

EDIT: Just realized that you do actually use all of your counting variables - it's still a useful thing to know, though, so I'll leave it in.
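To illustrate the dead-code point, here is a small sketch with two made-up methods: one throws its counter away, the other returns it so the caller can print it. Whether a given JVM actually removes the first loop (or reduces the second to a constant) depends on the JIT and its settings, so treat the timings as something to experiment with, not as a guaranteed outcome.

```java
public class KeepResultAlive {
    // The counter never escapes this method, so the JIT is free to drop the loop.
    static void countDiscarded(int n) {
        int res = 0;
        for (int i = 0; i < n; i++) {
            res++;
        }
    }

    // The counter is returned and printed by the caller, so the result must be computed.
    static int countReturned(int n) {
        int res = 0;
        for (int i = 0; i < n; i++) {
            res++;
        }
        return res;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        countDiscarded(2_000_000);
        long discarded = System.nanoTime() - start;

        start = System.nanoTime();
        int res = countReturned(2_000_000);
        long returned = System.nanoTime() - start;

        System.out.println("discarded: " + discarded + " ns");
        System.out.println("returned:  " + returned + " ns, res = " + res);
    }
}
```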

Upvotes: 1

Philipp

Reputation: 69773

In the lines

    t1.start();
    t2.start();

you are starting the threads, but you aren't actually waiting for them to finish before you take your time measurement. To wait until the threads are finished, call

   t1.join();
   t2.join();

The join method will block until the thread is finished. Then measure the execution time.
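Something along these lines, as a minimal sketch based on the code in the question. The class name `JoinedTiming` and the choice to wrap `InterruptedException` in an unchecked exception are mine, not part of the original program.

```java
public class JoinedTiming {
    private int res1 = 0;
    private int res2 = 0;

    // Starts both counting threads and blocks until both have finished,
    // so a timer around this call covers the actual counting work.
    public int func() {
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < 1_000_000; i++) {
                res1++;
            }
        });
        Thread t2 = new Thread(() -> {
            for (int i = 1_000_000; i < 2_000_000; i++) {
                res2++;
            }
        });

        t1.start();
        t2.start();
        try {
            t1.join(); // wait for t1 to finish
            t2.join(); // wait for t2 to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        // Safe to read here: join() makes the workers' writes visible to this thread.
        return res1 + res2;
    }

    public static void main(String[] args) {
        JoinedTiming m = new JoinedTiming();

        long startTime = System.nanoTime();
        int total = m.func();
        long endTime = System.nanoTime();

        System.out.println(total);
        System.out.println("duration: " + (endTime - startTime));
    }
}
```

With the joins inside the timed region, the result printed is the full count instead of whatever the threads happened to reach when main got there.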

Upvotes: 8

hmatar

Reputation: 2429

In the parallel version you are measuring how long the main thread takes to create the other two threads. You are not measuring their execution times. That is why you are getting super-linear speedup. In order to include their execution times you have to join them with the main thread.

Add these lines after t2.start();

     t1.join();  // wait until thread t1 terminates
     t2.join(); // wait until thread t2 terminates

Upvotes: 5
