dangerChihuahua007
dangerChihuahua007

Reputation: 20895

How does one iteratively write merge sort?

I have written a recursive version of merge sort. It makes use of a separate merge routine:

def merge(lst1, lst2):
    i = j = 0
    merged = []
    while i < len(lst1) and j < len(lst2):
        if lst1[i] <= lst2[j]:
            merged.append(lst1[i])
            i += 1
        else:
            merged.append(lst2[j])
            j += 1
    merged.extend(lst1[i:])
    merged.extend(lst2[j:])
    return merged

def merge_sort(lst):
    if len(lst) < 2:
        return lst
    else:
        middle = len(lst) / 2
        return merge(merge_sort(lst[:middle]), merge_sort(lst[middle:]))

To conserve stack space (and for kicks/the sheer joy of learning algorithms), I am trying to write this function in an iterative manner. However, I find this difficult because I am not sure how to combine disparate lists in the very end.

Thank you!

Upvotes: 7

Views: 7153

Answers (4)

lalatnayak
lalatnayak

Reputation: 170

Recursion is more intuitive hence I prefer the same except in some situations when I want to avoid a significant stack depth (e.g while consuming certain co-routine implementations). In case of Merge sort however the iterative version is actually easier to follow (at least the pseudo code).

All that's needed is a nested loop with the inner loop performing merges on pairs of 2^k elements with the outer loop responsible for incrementing k.

An additional step that is required is to merge any unpaired group with the previous merged group. An unpaired group will be encountered if the number of elements is not a power of 2. An unpaired group will always be at the end of the iteration.

e.g. [5, 7, 3, 4, 1, 9] -> [5, 7] [3, 4] [1, 9] -> [3, 4, 5, 7] [1, 9] -> [1, 3, 4, 5, 7, 9]

In the above example [1, 9] is a group that did not have another group to be merged with initially. Thus it was merged with the previous group (which had been merged and sorted already)

Here is a python implementation:

from MergeSort import merge

def sort(arr):
    n = len(arr) - 1
    c = 1
    start = 0
    mid = 0
    end = 0
    while c <= n:
        while end < n:
            mid = start + c//2
            end = start + c
            if (start < n) and (end <= n):
                merge(arr, start, mid, end)
                start = end + 1
            else:
                merge(arr, start - c - 1, start-1, n)
        c = 2*c + 1
        start = 0
        mid = 0
        end = 0

I used the merge function from the regular (recursive) version. While the above code is not the most elegant, but it works and has the same complexity as the recursive version. (I have not checked thoroughly but it seems so to me from a quick glance)

Here is a unit test:

def test_merge_sort_iterative(self):
    for i in range(1, 100):
        length = randint(10, 5000)
        data = [randint(1, 10000) for x in range(1, length)]
        IterativeMergeSort.sort(data)
        issorted = True
        i = 0
        while (i < len(data) - 1) & issorted:
            if data[i] > data[i + 1]:
                issorted = False
            i += 1
    self.assertTrue(issorted, data)
    return

Upvotes: 0

craftsmannadeem
craftsmannadeem

Reputation: 2943

Here is Java Implementation

public static <T extends Comparable<? super T>> void iterativeMergeSort(T[] seed) {

    for (int i = 1; i <seed.length; i=i+i)
    {
        for (int j = 0; j < seed.length - i; j = j + i+i)
        {
            inPlaceMerge(seed, j, j + i-1, Math.min(j+i + i -1, seed.length -1));
        }
    }       
}

public static <T extends Comparable<? super T>>  void inPlaceMerge(T[] collection, int low, int mid, int high) {
    int left = low;
    int right = mid + 1;

    if(collection[mid].equals(collection[right])) {
        return ;//Skip the merge if required
    }
    while (left <= mid && right <= high) {          
        // Select from left:  no change, just advance left
        if (collection[left].compareTo(collection[right]) <= 0) {
            left ++;
        } else { // Select from right:  rotate [left..right] and correct
            T tmp = collection[right]; // Will move to [left]
            rotateRight(collection, left, right - left);
            collection[left] = tmp;
            // EVERYTHING has moved up by one
            left ++; right ++; mid ++;
        }
    }       
}

Here is the unit test

private Integer[] seed;

@Before
public void doBeforeEachTestCase() {
    this.seed = new Integer[]{4,2,3,1,5,8,7,6};
}
@Test
public void iterativeMergeSortFirstTest() {
    ArrayUtils.<Integer>iterativeMergeSort(seed);
    Integer[] result = new Integer[]{1,2,3,4,5,6,7,8};
    assertThat(seed, equalTo(result));  
}

Upvotes: 1

Gluamice
Gluamice

Reputation: 71

I expanded based on Divya's description (added a test function for verification as well). The below code may be optimized by eliminating sub-arrays(data_1 and data_2) and sorting in place.

def merge_sort_iterative(data):
  """ gets the data using merge sort and returns sorted."""

  for j in range(1, len(data)):
    j *= 2
    for i in range(0,len(data),j):
      data_1 = data[i:i+(j/2)]
      data_2 = data[i+(j/2):j-i]
      l = m = 0
      while l < len(data_1) and m < len(data_2):
        if data_1[l] < data_2[m]:
          m += 1
        elif data_1[l] > data_2[m]:
          data_1[l], data_2[m] = data_2[m], data_1[l]
          l += 1
      data[i:i+(j/2)], data[i+(j/2):j-i] = data_1, data_2

  return data

def test_merge_sort():
  """test function for verifying algorithm correctness"""

  import random
  import time

  sample_size = 5000
  sample_data = random.sample(range(sample_size*5), sample_size)
  print 'Sample size: ', sample_size

  begin = time.time()
  sample_data = [5,4,3,2,1]
  result = merge_sort_iterative(sample_data)
  end = time.time()
  expected = sorted(sample_data)
  print 'Sorting time: %f \'secs'%(end-begin)

  assert result == expected, 'Algorithm failed'
  print 'Algorithm correct'

if __name__ == '__main__':
  test_merge_sort()

Upvotes: 2

Divya
Divya

Reputation: 2622

You will need a merge function (the same or almost same merge function) which will be called repeatedly. So, you don't need to change the merge function.

This is a multiple pass solution. Start with a chunk size of 2 and double the chunk size in every pass.

In every pass, partition the list into non-overlapping chunks of size whatever. Split every chunk into 2 and call merge on the 2 parts.

This is a bottom up version.

Upvotes: 4

Related Questions