Meena Chaudhary
Meena Chaudhary

Reputation: 10665

Performance of LinkedList vs ArrayList in maintaining an ordered list

I want to maintain an ordered List<Integer> of size <= 10^6. Every time a new element will be added I will call Collections.sort() method to sort the new element in the list. As per my knowledge ArrayList is better performing than LinkedList. But since I will be calling sort() method quite often, I have come to understanding that linkedList will perform better when sorting the list and will be a better choice over ArrayList, since there is no shifting of elements as in case of ArrayList(uses array as underlying data structure). Any suggestions which will be more efficient.

Upvotes: 14

Views: 4513

Answers (5)

smac89
smac89

Reputation: 43068

Since java does not have built in multiset, which is the perfect data structure for your situation, I will suggest using the TreeMultiset found in the guava library.

Multisets allow for duplicate elements, and a tree multiset will also add the benefit of keeping your collection sorted.

Upvotes: 6

Kayaman
Kayaman

Reputation: 73528

Calling sort() on a LinkedList is devastating on the performance, due to the default implementation of List.sort() converting the List to an array for sorting. There are very few cases in which it makes sense to use a LinkedList, even though it may seem like it should be effective.

If you wish to have the collection always sorted, you really should use an ordered collection like a TreeSet or maybe even a PriorityQueue. It will provide cleaner code (as well as faster sorting), since you don't have to worry about calling sort() yourself all the time.

Upvotes: 2

assylias
assylias

Reputation: 328568

You could use Collections#binarySearch on the sorted list to find the correct insertion point. ArrayList would probably perform better than a LinkedList, especially for larg-ish sizes, but that is easy to test.

I ran a micro benchmark of various methods: using a sort after each insertion or a binarySearch to insert in the right place, both with ArrayList (AL) and LinkedList (LL). I also added Commons TreeList and guava's TreeMultiset.

Conclusions

  • the best algo among those tested is using TreeMultiset, but it is not a list strictly speaking - the next best option is to use an ArrayList + binarySearch
  • ArrayList performs better than LinkedList in all situations and the latter took several minutes to complete with 100,000 elements (ArrayList took less than one second).

Code of the best performer, for reference:

@Benchmark public ArrayList<Integer> binarySearchAL() {
  ArrayList<Integer> list = new ArrayList<> ();

  Random r = new Random();
  for (int i = 0; i < n; i++) {
    int num = r.nextInt();
    int index = Collections.binarySearch(list, num);
    if (index >= 0) list.add(index, num);
    else list.add(-index - 1, num);
    current = list.get(0); //O(1), to make sure the sort is not optimised away
  }
  return list;
}

Full code on bitbucket.

Full results

The "Benchmark" column contains the name of the method under test (baseLine just fills a list without sorting it, the other methods have explicit names: AL=ArrayList, LL=LinkedList,TL=Commons TreeList,treeMultiSet=guava), (n) is the size of the list, Score is the time taken in milliseconds.

Benchmark                            (n)  Mode  Samples     Score     Error  Units
c.a.p.SO28164665.baseLine            100  avgt       10     0.002 ±   0.000  ms/op
c.a.p.SO28164665.baseLine           1000  avgt       10     0.017 ±   0.001  ms/op
c.a.p.SO28164665.baseLine           5000  avgt       10     0.086 ±   0.002  ms/op
c.a.p.SO28164665.baseLine          10000  avgt       10     0.175 ±   0.007  ms/op
c.a.p.SO28164665.binarySearchAL      100  avgt       10     0.014 ±   0.001  ms/op
c.a.p.SO28164665.binarySearchAL     1000  avgt       10     0.226 ±   0.006  ms/op
c.a.p.SO28164665.binarySearchAL     5000  avgt       10     2.413 ±   0.125  ms/op
c.a.p.SO28164665.binarySearchAL    10000  avgt       10     8.478 ±   0.523  ms/op
c.a.p.SO28164665.binarySearchLL      100  avgt       10     0.031 ±   0.000  ms/op
c.a.p.SO28164665.binarySearchLL     1000  avgt       10     3.876 ±   0.100  ms/op
c.a.p.SO28164665.binarySearchLL     5000  avgt       10   263.717 ±   6.852  ms/op
c.a.p.SO28164665.binarySearchLL    10000  avgt       10   843.436 ±  33.265  ms/op
c.a.p.SO28164665.sortAL              100  avgt       10     0.051 ±   0.002  ms/op
c.a.p.SO28164665.sortAL             1000  avgt       10     3.381 ±   0.189  ms/op
c.a.p.SO28164665.sortAL             5000  avgt       10   118.882 ±  22.030  ms/op
c.a.p.SO28164665.sortAL            10000  avgt       10   511.668 ± 171.453  ms/op
c.a.p.SO28164665.sortLL              100  avgt       10     0.082 ±   0.002  ms/op
c.a.p.SO28164665.sortLL             1000  avgt       10    13.045 ±   0.460  ms/op
c.a.p.SO28164665.sortLL             5000  avgt       10   642.593 ± 188.044  ms/op
c.a.p.SO28164665.sortLL            10000  avgt       10  1182.698 ± 159.468  ms/op
c.a.p.SO28164665.binarySearchTL      100  avgt       10    0.056 ±  0.002  ms/op
c.a.p.SO28164665.binarySearchTL     1000  avgt       10    1.083 ±  0.052  ms/op
c.a.p.SO28164665.binarySearchTL     5000  avgt       10    8.246 ±  0.329  ms/op
c.a.p.SO28164665.binarySearchTL    10000  avgt       10  735.192 ± 56.071  ms/op
c.a.p.SO28164665.treeMultiSet        100  avgt       10    0.021 ±  0.001  ms/op
c.a.p.SO28164665.treeMultiSet       1000  avgt       10    0.288 ±  0.008  ms/op
c.a.p.SO28164665.treeMultiSet       5000  avgt       10    1.809 ±  0.061  ms/op
c.a.p.SO28164665.treeMultiSet      10000  avgt       10    4.283 ±  0.214  ms/op

For 100k items:

c.a.p.SO28164665.binarySearchAL    100000  avgt        6  890.585 ± 68.730  ms/op
c.a.p.SO28164665.treeMultiSet      100000  avgt        6  105.273 ±  9.309  ms/op

Upvotes: 18

James_pic
James_pic

Reputation: 3307

Under Oracle Java / OpenJDK 7 or higher, the asymptotic performance of both would be similar. Collections.sort loads the list into an array, sorts the array, and loads the array back into the list by iterating through it (using a ListIterator), replacing its elements.

In both cases, this is an array sort on a mostly-sorted array (which is O(n) in OpenJDK 7 and higher, since it uses timsort), plus two list iterations (which are O(n) in both cases - although I'd expect LinkedList to have a worse constant term). So overall, it's an O(n) process, but likely to be slower for LinkedList.

If you're bulk inserting elements, the bulk insert will be O(n^2) overall, which is slower than inserting them all and sorting, or following Smac89's suggestion of using a TreeMultiset (both would be O(n log(n))).

And just for fun, here's a truly awful way to abuse TreeSet to allow it to store duplicate elements:

public class AwfulComparator<E extends Comparable<E>> implements Comparator<E> {
    public int compare(E o1, E o2) {
        int compared = o1.compareTo(o2);
        return (compared == 0)?1:compared; // Never compare equal
    }
}

new TreeSet<String>(new AwfulComparator<>());

Upvotes: 1

OliverS
OliverS

Reputation: 1271

You should consider to use data structures that are designed to maintain order if the sorting is your main performance consideration.

Using the normal Java base classes you could use either of these:

PriorityQueue (in case you want to retain duplicates)
TreeSet (filter duplicates)

In any case it will be easiest to just prototype all versions and run some benchmarks + profiling.

Upvotes: 1

Related Questions