Reputation: 4603
I found out that for quite big Arrays (over 1000 entries), the methods A.removeAll(B)
is way faster on a HashSet
than on an ArrayList
.
Do you have an idea of how these methods are implemented and how this could explain that difference?
Upvotes: 1
Views: 3654
Reputation: 7347
Basically the reason for both is the time complexity that these specific implementations are trying to achive for theyr respectiv operations.
The time complexity for the ArrayList
remove method is O(n - index)
source from When to use LinkedList over ArrayList?
While the remove method of the HashSet
offers constant time complexity O(1)
source from Hashset vs Treeset
Upvotes: 1
Reputation: 88707
A set (and thus HashSet
as well) contains at most one element of B
and since HashSet
uses hashes it is quite efficient to locate and remove that element. The overall complexity should thus be O(1)
for removing all (that is one) B
.
A list can contain any number of B
in any location so removing all B
has to check all elements. The overall complexity is O(n)
since every element has to be checked if it is a B
.
Edit:
If B
represents a collection/array, i.e. a set of multiple elements, you'd multiply the above complexities by the size m
of B
, so you'll get O(m)
for HashSet
and O(n * m)
for lists.
Edit 2:
Note that if you have a sorted list the complexity might be reduced to O(log(n))
or O(log(n) * m)
. For that to work the code removing the actual elements would have to know the list is sorted though and since ArrayList
is not guaranteed to be sorted it can't make that optimization.
Upvotes: 7