Reputation: 20895
I have written a recursive version of merge sort. It makes use of a separate merge
routine:
def merge(lst1, lst2):
i = j = 0
merged = []
while i < len(lst1) and j < len(lst2):
if lst1[i] <= lst2[j]:
merged.append(lst1[i])
i += 1
else:
merged.append(lst2[j])
j += 1
merged.extend(lst1[i:])
merged.extend(lst2[j:])
return merged
def merge_sort(lst):
if len(lst) < 2:
return lst
else:
middle = len(lst) / 2
return merge(merge_sort(lst[:middle]), merge_sort(lst[middle:]))
To conserve stack space (and for kicks/the sheer joy of learning algorithms), I am trying to write this function in an iterative manner. However, I find this difficult because I am not sure how to combine disparate lists in the very end.
Thank you!
Upvotes: 7
Views: 7153
Reputation: 170
Recursion is more intuitive hence I prefer the same except in some situations when I want to avoid a significant stack depth (e.g while consuming certain co-routine implementations). In case of Merge sort however the iterative version is actually easier to follow (at least the pseudo code).
All that's needed is a nested loop with the inner loop performing merges on pairs of 2^k elements with the outer loop responsible for incrementing k.
An additional step that is required is to merge any unpaired group with the previous merged group. An unpaired group will be encountered if the number of elements is not a power of 2. An unpaired group will always be at the end of the iteration.
e.g. [5, 7, 3, 4, 1, 9] -> [5, 7] [3, 4] [1, 9] -> [3, 4, 5, 7] [1, 9] -> [1, 3, 4, 5, 7, 9]
In the above example [1, 9] is a group that did not have another group to be merged with initially. Thus it was merged with the previous group (which had been merged and sorted already)
Here is a python implementation:
from MergeSort import merge
def sort(arr):
n = len(arr) - 1
c = 1
start = 0
mid = 0
end = 0
while c <= n:
while end < n:
mid = start + c//2
end = start + c
if (start < n) and (end <= n):
merge(arr, start, mid, end)
start = end + 1
else:
merge(arr, start - c - 1, start-1, n)
c = 2*c + 1
start = 0
mid = 0
end = 0
I used the merge function from the regular (recursive) version. While the above code is not the most elegant, but it works and has the same complexity as the recursive version. (I have not checked thoroughly but it seems so to me from a quick glance)
Here is a unit test:
def test_merge_sort_iterative(self):
for i in range(1, 100):
length = randint(10, 5000)
data = [randint(1, 10000) for x in range(1, length)]
IterativeMergeSort.sort(data)
issorted = True
i = 0
while (i < len(data) - 1) & issorted:
if data[i] > data[i + 1]:
issorted = False
i += 1
self.assertTrue(issorted, data)
return
Upvotes: 0
Reputation: 2943
Here is Java Implementation
public static <T extends Comparable<? super T>> void iterativeMergeSort(T[] seed) {
for (int i = 1; i <seed.length; i=i+i)
{
for (int j = 0; j < seed.length - i; j = j + i+i)
{
inPlaceMerge(seed, j, j + i-1, Math.min(j+i + i -1, seed.length -1));
}
}
}
public static <T extends Comparable<? super T>> void inPlaceMerge(T[] collection, int low, int mid, int high) {
int left = low;
int right = mid + 1;
if(collection[mid].equals(collection[right])) {
return ;//Skip the merge if required
}
while (left <= mid && right <= high) {
// Select from left: no change, just advance left
if (collection[left].compareTo(collection[right]) <= 0) {
left ++;
} else { // Select from right: rotate [left..right] and correct
T tmp = collection[right]; // Will move to [left]
rotateRight(collection, left, right - left);
collection[left] = tmp;
// EVERYTHING has moved up by one
left ++; right ++; mid ++;
}
}
}
Here is the unit test
private Integer[] seed;
@Before
public void doBeforeEachTestCase() {
this.seed = new Integer[]{4,2,3,1,5,8,7,6};
}
@Test
public void iterativeMergeSortFirstTest() {
ArrayUtils.<Integer>iterativeMergeSort(seed);
Integer[] result = new Integer[]{1,2,3,4,5,6,7,8};
assertThat(seed, equalTo(result));
}
Upvotes: 1
Reputation: 71
I expanded based on Divya's description (added a test function for verification as well). The below code may be optimized by eliminating sub-arrays(data_1 and data_2) and sorting in place.
def merge_sort_iterative(data):
""" gets the data using merge sort and returns sorted."""
for j in range(1, len(data)):
j *= 2
for i in range(0,len(data),j):
data_1 = data[i:i+(j/2)]
data_2 = data[i+(j/2):j-i]
l = m = 0
while l < len(data_1) and m < len(data_2):
if data_1[l] < data_2[m]:
m += 1
elif data_1[l] > data_2[m]:
data_1[l], data_2[m] = data_2[m], data_1[l]
l += 1
data[i:i+(j/2)], data[i+(j/2):j-i] = data_1, data_2
return data
def test_merge_sort():
"""test function for verifying algorithm correctness"""
import random
import time
sample_size = 5000
sample_data = random.sample(range(sample_size*5), sample_size)
print 'Sample size: ', sample_size
begin = time.time()
sample_data = [5,4,3,2,1]
result = merge_sort_iterative(sample_data)
end = time.time()
expected = sorted(sample_data)
print 'Sorting time: %f \'secs'%(end-begin)
assert result == expected, 'Algorithm failed'
print 'Algorithm correct'
if __name__ == '__main__':
test_merge_sort()
Upvotes: 2
Reputation: 2622
You will need a merge
function (the same or almost same merge
function) which will be called repeatedly. So, you don't need to change the merge
function.
This is a multiple pass solution. Start with a chunk size of 2 and double the chunk size in every pass.
In every pass, partition the list into non-overlapping chunks of size whatever. Split every chunk into 2 and call merge
on the 2 parts.
This is a bottom up version.
Upvotes: 4