Reputation: 378
As far as I remember ArrayList should be faster on gets and slower on add and remove. Given that - could you please give me a hint what is wrong with this code if it produces completely different results: remove time is almost the same for ArrayList and LinkedList and addition is faster for ArrayList...
import java.util.ArrayList;
import java.util.LinkedList;
public class ListsTest {
public static void main(String[] args) {
ArrayList arrayList = new ArrayList();
LinkedList linkedList = new LinkedList();
// ArrayList add
long startTime = System.nanoTime();
for (int i = 0; i < 100000; i++) {
arrayList.add(i);
}
long endTime = System.nanoTime();
long duration = endTime - startTime;
System.out.println(" ArrayList add: " + duration);
// LinkedList add
startTime = System.nanoTime();
for (int i = 0; i < 100000; i++) {
linkedList.add(i);
}
endTime = System.nanoTime();
duration = endTime - startTime;
System.out.println("LinkedList add: " + duration);
// ArrayList get
startTime = System.nanoTime();
for (int i = 0; i < 10000; i++) {
arrayList.get(i);
}
endTime = System.nanoTime();
duration = endTime - startTime;
System.out.println(" ArrayList get: " + duration);
// LinkedList get
startTime = System.nanoTime();
for (int i = 0; i < 10000; i++) {
linkedList.get(i);
}
endTime = System.nanoTime();
duration = endTime - startTime;
System.out.println("LinkedList get: " + duration);
// ArrayList remove
startTime = System.nanoTime();
for (int i = 9999; i >=0; i--) {
arrayList.remove(i);
}
endTime = System.nanoTime();
duration = endTime - startTime;
System.out.println(" ArrayList remove: " + duration);
// LinkedList remove
startTime = System.nanoTime();
for (int i = 9999; i >=0; i--) {
linkedList.remove(i);
}
endTime = System.nanoTime();
duration = endTime - startTime;
System.out.println("LinkedList remove: " + duration);
}
}
My results:
ArrayList add: 13540332
LinkedList add: 93488785
ArrayList get: 676937
LinkedList get: 335366109
ArrayList remove: 529793354
LinkedList remove: 410813052
Edit: As it was mentioned in a few comments it is important whether we add/remove/get to/from the end of the list or whether we use random index. When using random index all results are "correct":
arrayList.get((int) (Math.random()*arrayList.size()));
The same issue is resolved here: https://coderanch.com/t/669483/certification/ArrayList-LinkedList-speed-comparison
Upvotes: 1
Views: 961
Reputation: 494
Technical answer:
On the top level perspective you code looks pretty well and would naively be used by most of Java
developers to write some (micro) benchmark tests.
In fact (micro) benchmarking in Java
is a really tricky thing to do, since the JVM
is a big black box and also the JIT
is performing a lot of optimizations to your code - even while executing it! - The beauty and the beast of Java
.
For example:
Java
is - or was - considered a quite slow language due to it's use of an Interpreter/ JIT
. These will only read and compile your instructions only just a moment before being actually executed. This in fact is only true for the first ~10.000 executions (default value). When Java
detects a peace of code being executed for more then 10.000 times it performs a deeper analysis on it and "pre-compiles" it. Then on the next execution instead of interpreting it again, the pre-compiled code is used to execute it even faster.
Therefore it got it's name "Hotspot"-JVM
- it'll identify the "hot spots" of your code and optimize it accordingly (so that analysis is being performed dynamically during the code execution and not only upfront with static code analysis). Btw: This threshold can be controled with the -XX:CompileThreshold=
option of the JVM.
A second thing is, that Java
is pretty good in reading your code and finding unused or uneffective instructions. When Java
notices them it'll restructure your code to be more efficient or will even completely remove unnecessary instructions from the code base.
This are only two examples of how Java
and the JVM
optimize the code and there are a lot more! Too many to list all of them here, but there are a lot of great resources out there referring to it and explaining it in details (one short overview is linked here). Also note the used techniques and the success of the optimizations so also differs between the different JVM
vendors (Oracle, IBM, SAP, Red Hat, ...).
Therefore to write the unbiased micro benchmarks in Java
the JMH (Java Microbenchmark Harness) projects exists and should be used for that. It provides you many insights on of JVM
optimizations work and how you can write your tests to measure the impact of them (you have the ability to control some of them with special instructions offered by the JMH
) - or to benchmark your code unbiased of any of the optimizations.
This technical benchmarking also depends on the actual machine your code is executed on.
Everything counts:
The JMH
also contains methods and strategies (like "JVM warmup runs") to reduce this unique load peaks in order to get an unbiased result - but there will always be an impact on the measurement.
Therefore the concrete benchmarks results cannot be compared between 2 different machines - but they'll give a rough idea how performant a peace of code can work (specially when comparing 2 different algorithms on a single machine - like it is in your case).
Theoretical answer:
As you already mentioned there is the Big-O
notation to analyze and evaluate algorithms performance. This will have a look on the structure of an algorithm and evaluate it's performance based on it - neglecting the difficulties descibed in the technical answer above to come to a "qualitative" and not an "absolute" valuation. It'll have a look on the best, average and worst case scenario and then come to a rating based on it.
According to the Big-O
model you've already identified the theoretical odd's and benefits of an ArrayList
over an LinkedList
. But like you also noticed the performance of the data structures (so the implemented algorithms) varies depending on the real data they work with - whether it's reflecting more the best, average or worst case scenario.
In your micro benchmark for example you've reflected more like the best case scenario, in which elements only get added to the end of the lists (as well as being removed from there instead of for example from the middle -> the ArrayList
doesn't need to rearrange it's internal array to avoid gaps therein). If you'd change the code/ benchmark to add the elements to the head of the list (or somewhere in the middle), the performance of the ArrayList
would also drop drastically already.
This best, average and worst case scenarios therefore can be considered to be implemented as separate benchmarks in the JMH
like explained in the technical part of the answer, so that the performance can be compared for each of the scenarios.
But in case you're sure how your data is going to look like and you can specify it very accurately, the benchmarks will be most valuable if they also reflect your real data. In case you cannot specify it, you might implement all the scenarios to get a general overview.
EDIT:
This of course is only the executive point of view on the algorithms - and looks like you're only interested in it.
To complete the answer: I'd only want to mention that there is also the possibility to look on the memory consumption of the algorithms/ data structures. In case it's a requirement you need to consider it might also be worth to take a look on that. For example it also has been discussed here already and gives you a good overview when your interested in it.
Upvotes: 3
Reputation: 144
Arraylist is just an array [x1,x2,x3,x4,null,null,null,null] .. and it must have certain size
well its good practice, if you know how much your size your arraylist will be, you should initialize arraylist with the size so that adding to the end of list would be O(1) operation. much efficient and faster
on get O(1) .. as arrays can reach its element by index directly
on remove from index O(N) .. a new array is built and elements are copied to the new array with ignoring the element you want to remove
Linkedlist is a sequence of nodes, each node has a pointer to the next element and previous element (Double Linkedlist), a node is just a class which have value, pointer to next node, pointer to previous node
to add to specific index O(N) , you will start from the first node and point to the next node and so on, till you find your index, then it will create a node with your value and hook it to the desired index of linkedlist by setting its pointers
to get O(N), you are going to start from the first Node, and loop through nodes till you reach your index
to remove from specific index O(N), as you will loop through nodes till you reach your node, then you will remove this node completely and changing the pointers of its sibling nodes
as you can see, all your operations done on the code are O(N) except the get of arraylist is O(1), this is why all their time are near each other except get Arraylist is low time interval
Upvotes: 0