Bilesh Ganguly
Bilesh Ganguly

Reputation: 4131

Vector taking less time to get populated than ArrayList

I was going through the following article:

Understanding Collections and Thread Safety in Java

The article says:

You know, Vector and Hashtable are the two collections exist early in Java history, and they are designed for thread-safe from the start (if you have chance to look at their source code, you will see their methods are all synchronized!). However, they quickly expose poor performance in multi-threaded programs. As you may know, synchronization requires locks which always take time to monitor, and that reduces the performance.

[I've also done a benchmark using Caliper; please hear me out on this]

A sample code has also been provided:

public class CollectionsThreadSafeTest {

    public void testVector() {
        long startTime = System.currentTimeMillis();

        Vector<Integer> vector = new Vector<>();
        for (int i = 0; i < 10_000_000; i++) {
            vector.addElement(i);
        }

        long endTime = System.currentTimeMillis();
        long totalTime = endTime - startTime;
        System.out.println("Test Vector: " + totalTime + " ms");
    }

    public void testArrayList() {
        long startTime = System.currentTimeMillis();

        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < 10_000_000; i++) {
            list.add(i);
        }

        long endTime = System.currentTimeMillis();
        long totalTime = endTime - startTime;
        System.out.println("Test ArrayList: " + totalTime + " ms");
    }

    public static void main(String[] args) {
        CollectionsThreadSafeTest tester = new CollectionsThreadSafeTest();

        tester.testVector();
        tester.testArrayList();
    }
}

The output they have provided for the above code is as follows:

Test Vector: 9266 ms
Test ArrayList: 4588 ms

But when I ran it in my machine, it gave me the following result:

Test Vector: 521 ms
Test ArrayList: 2273 ms

I found this to be quite odd. I thought doing a micro benchmark would be better. So, I wrote a benchmark for the above using caliper. The code is as follows:

public class CollectionsThreadSafeTest extends SimpleBenchmark {

    public static final int ELEMENTS = 10_000_000;

    public void timeVector(int reps) {
        for (int i = 0; i < reps; i++) {
            Vector<Integer> vector = new Vector<>();
            for (int k = 0; k < ELEMENTS; k++) {
                vector.addElement(k);
            }
        }
    }

    public void timeArrayList(int reps) {
        for (int i = 0; i < reps; i++) {
            List<Integer> list = new ArrayList<>();
            for (int k = 0; k < ELEMENTS; k++) {
                list.add(k);
            }
        }
    }

    public static void main(String[] args) {
        String[] classesToTest = { CollectionsThreadSafeTest.class.getName() };
        Runner.main(classesToTest);
    }
}

But I got a similar result:

 0% Scenario{vm=java, trial=0, benchmark=ArrayList} 111684174.60 ns; ?=18060504.25 ns @ 10 trials
50% Scenario{vm=java, trial=0, benchmark=Vector} 67701359.18 ns; ?=17924728.23 ns @ 10 trials

benchmark    ms linear runtime
ArrayList 111.7 ==============================
   Vector  67.7 ==================

vm: java
trial: 0

I'm kinda confused. What is happening here? Am I doing something wrong here (that would be really embarrassing) ?

If this is the expected behavior, then what is the explanation behind this?


Update #1

After reading @Kayaman's answer, I ran the caliper tests by changing the values of the initial capacities of both the Vector and the ArrayList. Following are the timings (in ms):

Initial Capacity    Vector  ArrayList
-------------------------------------
10_000_000          49.2    67.1
10_000_001          48.9    71.2
10_000_010          48.1    61.2
10_000_100          43.9    70.1
10_001_000          45.6    70.6
10_010_000          44.8    68.0
10_100_000          52.8    64.6
11_000_000          52.7    71.7
20_000_000          74.0    51.8
-------------------------------------

Thanks for all the inputs :)

Upvotes: 2

Views: 119

Answers (3)

Kayaman
Kayaman

Reputation: 73568

You're not really testing the add() method here. You're testing the different ways that a Vector and an ArrayList grow. A Vector doubles in size when it's full, but an ArrayList has some more refined logic to prevent the internal array from growing exponentially and wasting memory.

If you run your test with a > 10000000 initial capacity for both classes, they won't need to resize and you'll be profiling just the adding part.

Upvotes: 3

Jay Smith
Jay Smith

Reputation: 2490

Both ArrayList and Vector have the same add method:

ensureCapacity();
    elementData[elementCount++] = newElement;

The difference is only one. Vector's add method is synchronized and ArrayList's is not. From theory synchronized methods are slower than non-synchronized.

To improve performance of add method you have to specify initialCapacity in constructor or call method ensureCapacity. This creates internal array as long as you need and so no need to recreate it.

Upvotes: -1

strash
strash

Reputation: 1321

The vector is expected to be slower in multithreaded environment. It is expected to be lightweight in your case. Better do the tests with adding these items from 10000 different threads

Upvotes: 0

Related Questions