Reputation: 4131
I was going through the following article:
Understanding Collections and Thread Safety in Java
The article says:
You know, Vector and Hashtable are the two collections exist early in Java history, and they are designed for thread-safe from the start (if you have chance to look at their source code, you will see their methods are all synchronized!). However, they quickly expose poor performance in multi-threaded programs. As you may know, synchronization requires locks which always take time to monitor, and that reduces the performance.
[I've also done a benchmark using Caliper; please hear me out on this]
A sample code has also been provided:
public class CollectionsThreadSafeTest {
public void testVector() {
long startTime = System.currentTimeMillis();
Vector<Integer> vector = new Vector<>();
for (int i = 0; i < 10_000_000; i++) {
vector.addElement(i);
}
long endTime = System.currentTimeMillis();
long totalTime = endTime - startTime;
System.out.println("Test Vector: " + totalTime + " ms");
}
public void testArrayList() {
long startTime = System.currentTimeMillis();
List<Integer> list = new ArrayList<>();
for (int i = 0; i < 10_000_000; i++) {
list.add(i);
}
long endTime = System.currentTimeMillis();
long totalTime = endTime - startTime;
System.out.println("Test ArrayList: " + totalTime + " ms");
}
public static void main(String[] args) {
CollectionsThreadSafeTest tester = new CollectionsThreadSafeTest();
tester.testVector();
tester.testArrayList();
}
}
The output they have provided for the above code is as follows:
Test Vector: 9266 ms
Test ArrayList: 4588 ms
But when I ran it in my machine, it gave me the following result:
Test Vector: 521 ms
Test ArrayList: 2273 ms
I found this to be quite odd. I thought doing a micro benchmark would be better. So, I wrote a benchmark for the above using caliper. The code is as follows:
public class CollectionsThreadSafeTest extends SimpleBenchmark {
public static final int ELEMENTS = 10_000_000;
public void timeVector(int reps) {
for (int i = 0; i < reps; i++) {
Vector<Integer> vector = new Vector<>();
for (int k = 0; k < ELEMENTS; k++) {
vector.addElement(k);
}
}
}
public void timeArrayList(int reps) {
for (int i = 0; i < reps; i++) {
List<Integer> list = new ArrayList<>();
for (int k = 0; k < ELEMENTS; k++) {
list.add(k);
}
}
}
public static void main(String[] args) {
String[] classesToTest = { CollectionsThreadSafeTest.class.getName() };
Runner.main(classesToTest);
}
}
But I got a similar result:
0% Scenario{vm=java, trial=0, benchmark=ArrayList} 111684174.60 ns; ?=18060504.25 ns @ 10 trials
50% Scenario{vm=java, trial=0, benchmark=Vector} 67701359.18 ns; ?=17924728.23 ns @ 10 trials
benchmark ms linear runtime
ArrayList 111.7 ==============================
Vector 67.7 ==================
vm: java
trial: 0
I'm kinda confused. What is happening here? Am I doing something wrong here (that would be really embarrassing) ?
If this is the expected behavior, then what is the explanation behind this?
Update #1
After reading @Kayaman's answer, I ran the caliper tests by changing the values of the initial capacities of both the Vector
and the ArrayList
. Following are the timings (in ms):
Initial Capacity Vector ArrayList
-------------------------------------
10_000_000 49.2 67.1
10_000_001 48.9 71.2
10_000_010 48.1 61.2
10_000_100 43.9 70.1
10_001_000 45.6 70.6
10_010_000 44.8 68.0
10_100_000 52.8 64.6
11_000_000 52.7 71.7
20_000_000 74.0 51.8
-------------------------------------
Thanks for all the inputs :)
Upvotes: 2
Views: 119
Reputation: 73568
You're not really testing the add()
method here. You're testing the different ways that a Vector
and an ArrayList
grow. A Vector
doubles in size when it's full, but an ArrayList
has some more refined logic to prevent the internal array from growing exponentially and wasting memory.
If you run your test with a > 10000000 initial capacity for both classes, they won't need to resize and you'll be profiling just the adding part.
Upvotes: 3
Reputation: 2490
Both ArrayList and Vector have the same add method:
ensureCapacity();
elementData[elementCount++] = newElement;
The difference is only one. Vector's add method is synchronized and ArrayList's is not. From theory synchronized methods are slower than non-synchronized.
To improve performance of add method you have to specify initialCapacity
in constructor or call method ensureCapacity
. This creates internal array as long as you need and so no need to recreate it.
Upvotes: -1
Reputation: 1321
The vector is expected to be slower in multithreaded environment. It is expected to be lightweight in your case. Better do the tests with adding these items from 10000 different threads
Upvotes: 0