Reputation:
Consider the following Java code (complete, compiles and runs fine).
The code creates an array containing 5,000,000 integers (1 to 5 million), loops over it, and creates an ArrayList of the perfect squares it finds. Perfect squares are detected using a naive technique, and not bit manipulation, but that is not the focus of the issue at hand.
Mathematically, between 1 and 5M, there are 2236 perfect squares. So the ArrayList that the perfect squares are being put into will have a final size of 2236.
import java.util.ArrayList;
public class PerfSquares {
public static ArrayList<Integer> perfectSquares(int[] arr) {
ArrayList<Integer> al = new ArrayList<Integer>();
// ArrayList<Integer> al = new ArrayList<Integer>(arr.length);
for (int i = 0; i < arr.length; i++) {
double root = Math.sqrt(arr[i]);
int irt = (int) Math.floor(root);
if (irt * irt == arr[i]) {
al.add(arr[i]);
}
}
return al;
}
public static void main(String[] args) {
int[] arr = new int[5000000];
for (int i = 0; i < arr.length; i++) {
arr[i] = i + 1;
}
long s = System.currentTimeMillis();
perfectSquares(arr);
long e = System.currentTimeMillis();
System.out.println(e - s);
}
}
I'd like to focus on the declaration of the ArrayList. These two lines, one of which is commented out:
ArrayList<Integer> al = new ArrayList<Integer>();
//ArrayList<Integer> al = new ArrayList<Integer>(arr.length);
When I run with the first declaration (without the size explicitly supplied), the timediff I see is:
~96 milliseconds.
When I run with the second declaration (with the size explicitly supplied), the timediff increases to:
~105 milliseconds
Question:
Why is this behavior so? Shouldn't the second case (size supplied) be faster?
As per my understanding, in the first case, when we omit the size parameter to the ArrayList creation, behind the scenes an array of length 10 will be initialized. And when this capacity is exceeded, a new array with a larger capacity (not sure how much larger) will be allocated, and the previous elements will be copied over.
For 2236 elements and no initial size being specified, this "cap exceeded - allocate new - copy over - append more till cap" cycle should repeat many times, slowing it down.
Consequently, I was expecting the size supplied declaration to be faster - Since the allocation will happen once, and there will be no capacity exceeding / new array creation and copying over happening.
Or is this basically so because 2236 appends to an ArrayList, even with all the cap-exceeds-copy-over cycles, will still be faster than creating an ArrayList of size 5,000,000?
Upvotes: 8
Views: 6418
Reputation: 16526
Because memory allocation is generally a slow operation. I counted the number of allocations and new element for both cases.
In this case
ArrayList<Integer> al = new ArrayList<Integer>();
You have only 15 allocations for 8317 elements in total. And in this case
ArrayList<Integer> al = new ArrayList<Integer>(arr.length);
you have a single allocation for 5000000 elements.
Upvotes: 1
Reputation: 7423
When you call add()
on an ArrayList
which is already full, it grows automatically by 50 %. So it will be quickly big enough and there will not be so many memory allocations.
Upvotes: 0
Reputation: 22291
First of all, your method of measurement is flawed. Under these circumstances, however, measurement is not easy, because there for such large array allocation each following new may be slower.
As for your problem: memory allocation (and even deallocation) is an expensive operation. Not when you use new
- nowadays vms are quite optimized for many small short-lived objects - but mostly whenever the JVM has to reserve/allocate (aka malloc()
) memory on the lower, system level. Moreover, the memory allocation time also depends on the amount of memory allocated - the more you need, the longer it will take.
In your case: the amount of perfect squares is AFAIR easy to compute. Just use (Math.sqrt(arr.length) + 1)
to determine the initial ArrayList
size and get the size exactly right immediately.
Upvotes: 2
Reputation: 650
You're creating an arraylist of 5 million, so clearly it's slower. You only need 2236. That's alot of waste.
If you change the size of your array list to 10k for example you'll see the time difference shrink.
To make it simpler, just try this test, multiple times, in different orders -
public static void main(String[] args) {
long timea = System.currentTimeMillis();
// ArrayList<Integer> al = new ArrayList<Integer>();
ArrayList<Integer> al = new ArrayList<Integer>(5000000);
System.out.println(System.currentTimeMillis() - timea);
}
You'll see that initialising an arraylist to 5 million takes around 10ms (on my macbook), while the one with no default size is pretty much instantaneous. This is the same no mater which order you test.
Upvotes: 6