removeAll() in ArrayList vs HashSet

Question

I found out that for quite big Arrays (over 1000 entries), the methods A.removeAll(B) is way faster on a HashSet than on an ArrayList.

Do you have an idea of how these methods are implemented and how this could explain that difference?

Thomas · Accepted Answer

A set (and thus HashSet as well) contains at most one element of B and since HashSet uses hashes it is quite efficient to locate and remove that element. The overall complexity should thus be O(1) for removing all (that is one) B.

A list can contain any number of B in any location so removing all B has to check all elements. The overall complexity is O(n) since every element has to be checked if it is a B.

Edit:

If B represents a collection/array, i.e. a set of multiple elements, you'd multiply the above complexities by the size m of B, so you'll get O(m) for HashSet and O(n * m) for lists.

Edit 2:

Note that if you have a sorted list the complexity might be reduced to O(log(n)) or O(log(n) * m). For that to work the code removing the actual elements would have to know the list is sorted though and since ArrayList is not guaranteed to be sorted it can't make that optimization.

removeAll() in ArrayList vs HashSet

Answers (2)

Related Questions