Reputation: 3955
For example, ArrayList
s in Java have a resizing factor of 2. When the array that the ArrayList
is wrapped around runs out of space, all the elements of that array are transferred to a new array that is 2 times the size of the original array.
Since Python lists/arrays are naturally dynamic, what are their resizing factor? Or do they use some other method of scaling? If so, what's that method? What's its asymptotic runtime (big O)?
Upvotes: 12
Views: 5551
Reputation: 59368
To answer your other question: adding an item to the end of an ArrayList
or similar takes amortized O(1) time as long as there is any factor k > 1 such that reallocation always makes an array that is k times as big as the previous one.
The actual factor used doesn't really matter as long as it's bigger than one. Alternative reallocation schemes might have a different complexity. If reallocation adds a constant amount, for example, then adding an item takes amortized O(N) time.
That's why all professionally written dynamically growing arrays grow by some factor each time. The factor provides a fixed upper bound on the proportion of wasted space ((k-1)/k), traded off against a fixed upper bound on the average number of times each item could be copied in a sequence of adds.
Lets say you're using factor k and you've just performed the Nth reallocation. That means you have added about k^N items. How many items have you copied? Let the be C. Then:
C = k^(N-1) + k(N-2) + ... + 1
kC - C = k^N + k^(N-1) - k^(N-1) + k^(N-2) - k^(N-2) ... - 1
kC - C = k^N - 1
C = (k^N-1) / (k-1)
The number of copies per item added, then is C/K^N, which is pretty much 1/(k-1).
So Java's factor of k=2 means there is about one copy per add operation, with up to 50% unused slots, and CPython's factor of k=1.125 means there are about 8 copies per add operation with up to 11% unused slots.
Upvotes: 2
Reputation: 41311
For CPython, in particular, the overallocation during resize is implemented in objects/listobject.c. It is described as
mild, but ... enough to give linear-time amortized behavior over a long sequence of appends() in the presence of a poorly-performing system realloc().
When the list needs to be grown, it is grown to be approximately 112.5% of the necessary size. In particular, the actual size is new_size + new_size >> 3 + (newsize < 9 ? 3 : 6)
.
Upvotes: 10