mahjong
mahjong

Reputation: 13

merge sort in k parts

I'd really appreciate some help on this assignment. The task is develop a merge sort algorithm that "splits" the input array recursively into k arrays. I initially coded two methods to merge sort with two parts, namely merge sort and merge. I tried to generalize this algorithm

public class MergeSort {

    public static int[] mergeSort(int[] a, int p, int r) {
        // p is left bound
        // r is right bound
        if (p < r) {
            int q = (int)Math.floor((p + r) / 2);
            mergeSort(a, p, q);
            mergeSort(a, q + 1, r);
            return merge(a, p, q, r);
        } else
            return a;
    }

    // p ist linke grenze
    // r ist rechte grenze
    public static int[] merge(int[] a, int p, int q, int r) {
        int n1 = q - p + 1; //length of first array
        int n2 = r - q;     //length of second array
        int[] lt = new int[n1 + 1];
        int[] rt = new int[n2 + 1];
        for (int i = 0; i < n1; i++) {
            lt[i] = a[p + i];
        }
        for (int j = 0; j < n2; j++) {
            rt[j] = a[q + j + 1];
        }
        lt[n1] = 1000000000; //sentinels
        rt[n2] = 1000000000;
        int i = 0;
        int j = 0;

        for (int k = p; k <= r; k++) { //comparing the values of the arrays and merging
            if (lt[i] <= rt[j]) {
                a[k] = lt[i];
                i++;
            } else {
                a[k] = rt[j];
                j++;
            }
        }
        return a;
    }

    public static int[] mergeSortK(int[] a, int k, int p, int r) {
        // k number of steps; p is first index of array; r is last index of array;
        if (p < r) {
            int[] pos = new int[k + 1]; //array for saving the indices of the "splits"
            for (int i = 0; i <= k; i++) {
                pos[i] = (int) Math.floor(p + (r - p) / k * i); //saving the array indices
            }
            for (int i = 0; i < k; i++) {
                mergeSortK(a, k, pos[i], pos[i + 1]); //sorting the arrays
            }
            for (int i = 0; i < k - 1; i++) {
                merge(a, pos[i], pos[i + 1], pos[i + 2]); //merging the arrays pairwise
            }
        }
        return a;
    }


    public static void main(String[] args) {
        // task 2.1.a)
        // Example Values:
        int[] list = { 2, 1, 5, 6, 2, 12 };
        int k = 4;

        // use MergeSort
        int[] newlist = mergeSortK(list, k, 0, list.length);
        printList(newlist);
    }

    // Helper function to print the elements of a list
    private static void printList(int[] list) {
        for(int i = 0; i < list.length; i++) {
            System.out.println(list[i]);
        }
    }
}

The input given in the main method results in {2, 1, 2, 5, 6, 12}

Any help is immensely appreciated! Sorry if I'm doing some mistakes, I'm here to learn and I really hope you guys can help me out!

Upvotes: 1

Views: 1303

Answers (1)

chqrlie
chqrlie

Reputation: 145297

There are a few problems in your code:

  • there is no need for Math.floor() to truncate the results of integer division, unlike Javascript, Java uses different semantics for / for int arguments and floating point arguments. (p + r) / 2 is the integral quotient, But you might want to write p + (r - p) / 2 to avoid potential overflow on p + r.
  • the upper index is excluded as you pass list.length from main(). This is actually a very convenient convention to avoid the need for +1 adjustments when computing slice sizes. Remove those erroneous adjustments and rely on the included/excluded convention.
  • don't use sentinels: using sentinels prevents you from correctly sorting arrays containing values greater or equal to the sentinel value 1000000000. This approach is not necessary and should be banned. Just compare the index variables to the slice lengths and copy the remaining elements when one of the slices is exhausted.
  • your computation for the slice boundaries in mergeSortK is incorrect: p + (r - p) / k * i is computed with integer arithmetics so (r - p) / k is rounded before the multiplication. The last slice ending index will not equal r if r - p is not a multiple of k. Multiplying before the division would solve this issue but might overflow the range of type int.
  • mergeSortK does not perform k-way merging, but a series of partial merges that are insufficient for k > 2.
  • your test set is a bit small.

Here is a corrected version:

public class MergeSort {

    public static int[] mergeSort(int[] a, int p, int r) {
        // p is left bound (included)
        // r is right bound (excluded)
        if (r - p >= 2) {
            int q = p - (r - p) / 2;
            mergeSort(a, p, q);
            mergeSort(a, q, r);
            return merge(a, p, q, r);
        } else {
            return a;
        }
    }

    // p is left bound (included)
    // q is start of right slice
    // r is end of right slice (excluded)
    public static int[] merge(int[] a, int p, int q, int r) {
        int n1 = q - p;  // length of first array
        int n2 = r - q;  // length of second array
        int[] lt = new int[n1];
        for (int i = 0; i < n1; i++) {
            lt[i] = a[p + i];
        }
        int i = 0;  // index into lt
        int j = q;  // index into a for right slice
        int k = p;  // index into a for merged list

        while (i < n1 && j < r) { //comparing the values of the arrays and merging
            if (lt[i] <= a[j]) {
                a[k] = lt[i];
                i++;
                k++;
            } else {
                a[k] = a[j];
                j++;
                k++;
            }
        }
        while (i < n1) { // copy remaining elements from right slice
            a[k] = lt[i];
            i++;
            k++;
        }
        // remaining elements from right slice are already in place
        return a;
    }

    public static int[] mergeSortK(int[] a, int k, int p, int r) {
        // k amount of steps; p is first index of slice; r is last index of slice (excluded);
        if (r - p >= 2) {
            if (k > r - p)
                k = r - p;
            int[] pos = new int[k + 1]; //array for saving the indices of the "splits"
            for (int i = 0; i <= k; i++) {
                pos[i] = p + (r - p) * i / k; //saving the array indices
            }
            for (int i = 0; i < k; i++) {
                mergeSortK(a, k, pos[i], pos[i + 1]); //sorting the arrays
            }
            while (k > 1) {
                int i, n = 1;
                for (i = 0; i < k - 1; i += 2) {
                    // merge slices 2 at a time: this will produce the expected output
                    //  but is not a direct k-way merge.
                    merge(a, pos[i], pos[i + 1], pos[i + 2]);
                    pos[n++] = pos[i + 2];
                }
                if (i < k)
                    pos[n++] = pos[i + 1];
                k = n - 1;
            }
        }
        return a;
    }

    public static void main(String[] args) {
        // task 2.1.a)
        // Example Values:
        int[] list = {
            64, 36, 46, 31, 45, 52,  4, 48, 74, 59,
            12, 16, 70, 67, 71, 26, 73, 34, 46, 84,
            60, 16, 26, 68, 56, 57, 97,  6, 39, 74,
            25, 69, 29, 69, 77, 26, 44, 53, 20,  6,
            77, 31, 71, 91, 28,  6, 24, 75, 26, 33,
             3, 20, 55, 94, 17, 81, 88, 32, 94, 32,
             3, 90, 76, 69,  9, 96, 76, 53, 78, 14,
            97, 32, 17, 15, 61, 63, 21,  0, 16, 14,
            61,  4, 81, 86, 29, 29, 27, 57, 85,  5,
            91, 54,  6, 68, 40, 88, 41,  9, 90, 51 };
        int k = 4;  // must be at least 2

        // use MergeSort
        int[] newlist = mergeSortK(list, k, 0, list.length);
        printList(newlist);
    }

    // Helper function to print the elements of a list
    private static void printList(int[] list) {
        for (int i = 0; i < list.length; i++) {
            System.out.println(list[i]);
        }
    }
}

I did not write a correct k-way merging phase at the end of mergeSortK, the code above should work but will merge the k slices in ceil(log2(k)) passes. Direct one pass k-way merging is tricky and usually not worth it.

Upvotes: 1

Related Questions