Reputation: 553
I'm learning the well known sorting algorithms and implementing them myself. I have recently done merge sort and the code I have is:
def merge(l, r, direction):
# print('merging')
# print(l, r)
# adding infinity to end of list so we know when we've hit the bottom of one pile
l.append(inf)
r.append(inf)
A = []
i, j = 0, 0
while (i < len(l)) and (j < len(r)):
if l[i] <= r[j]:
A.append(l[i])
i += 1
else:
A.append(r[j])
j += 1
# removing infinity from end of list
A.pop()
return(A)
def merge_sort(num_lst, direction='increasing', level=0):
if len(num_lst) > 1:
mid = len(num_lst)//2
l = num_lst[:mid]
r = num_lst[mid:]
l = merge_sort(l, level=level + 1)
r = merge_sort(r, level=level + 1)
num_lst = merge(l, r, direction)
return num_lst
What I have seen in other implementations that differs from my own is in the merging of lists. Where I just create a blank list and append elements in numerical order other pass the preexisting into merge and overwrite each element to create a list in numerical order. Something like:
def merge(arr, l, m, r):
n1 = m - l + 1
n2 = r- m
# create temp arrays
L = [0] * (n1)
R = [0] * (n2)
# Copy data to temp arrays L[] and R[]
for i in range(0 , n1):
L[i] = arr[l + i]
for j in range(0 , n2):
R[j] = arr[m + 1 + j]
# Merge the temp arrays back into arr[l..r]
i = 0 # Initial index of first subarray
j = 0 # Initial index of second subarray
k = l # Initial index of merged subarray
while i < n1 and j < n2 :
if L[i] <= R[j]:
arr[k] = L[i]
i += 1
else:
arr[k] = R[j]
j += 1
k += 1
# Copy the remaining elements of L[], if there
# are any
while i < n1:
arr[k] = L[i]
i += 1
k += 1
# Copy the remaining elements of R[], if there
# are any
while j < n2:
arr[k] = R[j]
j += 1
k += 1
I'm curious about the following:
Is using a append()
on a blank list a bad idea? By my understanding when python creates a list it grabs a certain size chunk of memory and if our list grows beyond that copies the list to a different and larger section of memory (which seems it would be pretty high cost if it happened even once for a large list let alone repeatedly).
Is there just a higher cost to using append()
compared to accessing a list by index? It would seemed to me append would be able to do things at a fairly low cost.
Upvotes: 2
Views: 120
Reputation: 36249
When you instantiate a list Python will allocate the memory necessary for holding the items plus some extra memory for future appends / extends. When you put too much extra stuff in the list it eventually needs to be reallocated which potentially slows down the program (see this part of the source code). The reallocation happens here and the size is calculated as:
new_allocated = (size_t)newsize + (newsize >> 3) + (newsize < 9 ? 3 : 6);
As the comment states the growth pattern is: 0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ...
.
So if the size of the final list is known in advance and allocated from the beginning this will save you some extra reallocations (and thus compute time).
Upvotes: 3
Reputation: 1150
from time import time
append_arr = []
index_arr = [None] * 10240*1024
t0 = time()
for x in range(10240*1024):
append_arr.append(x)
t1 = time()
t2 = time()
for i in range(10240*1024):
index_arr[i] = i
t3 = time()
print(str(t1-t0))
print(str(t3-t2))
Looks like appending is slower.
Upvotes: 0