Reputation: 11
I've been having an issue that I couldn't debug for quite some time. I am trying to implement a MergeSort algorithm with no additional steps of array copying by following Robert Sedgewick's algorithm in "Algorithm's in C++" book.
Short description of the algorithm:
The recursive program is set up to sort b, leaving results in a. Thus, the recursive calls are written to leave their result in b, and we use the basic merge program to merge those files from b into a. In this way, all the data movement is done during the course of the merges.
The problem is that I cannot find any logical errors but the sorting isn't done properly. Data gets overwritten somewhere and I cannot determine what logical error causes this. The data is sorted when the program is finished but it is not the same data any more.
For example, Input array: { A, Z, W, B, G, C }
produces the array: { A, G, W, W, Z, Z }
.
I can obviously see that it must be a logical error somewhere, but I have been trying to debug this for a pretty long time and I think a fresh set of eyes could maybe see what I'm missing cause I really can't find anything wrong.
Full code I am running:
//Here is my complete code that I run and that behaves as specified above.
#include <iostream>
#include <stdlib.h>
using namespace std;
// function to print the array
void print(char * a[],int l, int r)
{ for(int i=l;i<=r;i++) cout << (i+1) << ": " << a[i] << endl; }
static const int M = 1;
void insertion(char** a, int l, int r)
{ int i,j;
char * temp;
for(i=1;i<r+1;i++)
{ temp = a[i];
j = i;
while((j>0) && strcmp(a[j-1],temp)>0)
{ a[j] = a[j-1];
j = j - 1; }
a[j] = temp; } }
//merging a and b into c
void merge(char ** c,char ** a, int N, char ** b, int M)
{ for (int i=0, j=0, k=0; k<(N+M); k++)
{ if(i == N) {c[k] = b[j++]; continue; }
if(j == M) {c[k] = a[i++]; continue; }
c[k] = strcmp(a[i], b[j])<0 ? a[i++] : b[j++]; } }
void mergesortAux(char ** a, char ** b, int l, int r)
{ if(r-l <= M) { insertion(a, l, r); return; }
int m = (l+r)/2;
mergesortAux(b, a, l, m); //merge sort left
mergesortAux(b, a, m+1, r); //merge sort right
merge(a+l,b+l,m-l+1,b+m+1,r-m); } //merge
void mergesort(char ** a,int l, int r, int size)
{ static char ** aux = (char**)malloc(size*sizeof(char*));
for(int i = l; i<size; i++) aux[i] = a[i];
mergesortAux(a,aux,l,r); }
//free(aux); } I removed this piece of code as suggested, I realize it's unnecessary
int main(int argc, char * argv[])
{ int size = 6;
char **data = (char**)malloc(size*sizeof(char*));
data[0] = "A";
data[1] = "Z";
data[2] = "W";
data[3] = "B";
data[4] = "G";
data[5] = "C";
print(data,0,size-1);
printf("---------------------------\n");
mergesort(data,0,size-1,size);
printf("---------------------------\n");
print(data,0,size-1);
return 0;
}
Output:
1: A 2: Z 3: W 4: B 5: G 6: C --------------------------- --------------------------- 1: A 2: G 3: W 4: W 5: Z 6: Z
Upvotes: 0
Views: 514
Reputation: 35891
Your whole code is completely messed up.
You're confusing the input array with the output array, with an attempt to sort in place all the time. Your code formatting is terrible. Your variable names are terrible. You're using redundant arguments (you're passing size
along with the index range l
and r
). You confuse the meaning of the index ranges. There are so many things wrong with this code, that I could just describe here whole night.
However, only the last one is the critical here:
In this line:
{ if(r-l <= M) { insertion(a, l, r); return; }
You attempt to sort the l
to r
part of the array a
. However, the l
index is not used in the insertion
function so it modifies a
from 0
to r
inclusive. r
suggests that it's the right index, but is used in this method as the array's size.
The simplest fix is to change it to:
if(r-l <= M) { insertion(a+l, l, r-l); return; }
But fixing the insertion
method to handle the indexes properly is probably a better idea.
Also, I strongly recommend you clean up the whole code. This is unmaintainable. This is unreadable. This is the darkest side of C
. I won't ever understand why people can't use understandable function/variable names, use structs to represent composite concepts and partition the logic into small readable parts. As if writing in C
forced you to think that you're setting the individual bits in processor's registers and couldn't use a variable name that is more than 1 character long.
Even you - the author of this code have problems understanding it. This is a strong suggestion that there are some serious problems with it :)
Sorry for the frustration, I couldn't resist the puzzle and wasted a lot of time on this :)
Upvotes: 2