Reputation: 11441
I am reading about merge sort in Algorithms in C++ by Robert Sedgewick and have following questions.
static void mergeAB(ITEM[] c, int cl, ITEM[] a, int al, int ar, ITEM[] b, int bl, int br )
{
int i = al, j = bl;
for (int k = cl; k < cl+ar-al+br-bl+1; k++)
{
if (i > ar) { c[k] = b[j++]; continue; }
if (j > br) { c[k] = a[i++]; continue; }
c[k] = less(a[i], b[j]) ? a[i++] : b[j++];
}
}
The characteristic of the basic merge that is worthy of note is that the inner loop includes two tests to determine whether the ends of the two input arrays have been reached. Of course, these two tests usually fail, and the situation thus cries out for the use of sentinel keys to allow the tests to be removed. That is, if elements with a key value larger than those of all the other keys are added to the ends of the a and aux arrays, the tests can be removed, because when the a (b) array is exhausted, the sentinel causes the next elements for the c array to be taken from the b (a) array until the merge is complete.
However, it is not always easy to use sentinels, either because it might not be easy to know the largest key value or because space might not be available conveniently.
For merging, there is a simple remedy. The method is based on the following idea: Given that we are resigned to copying the arrays to implement the in-place abstraction, we simply put the second array in reverse order when it is copied (at no extra cost), so that its associated index moves from right to left. This arrangement leads to the largest element—in whichever array it is—serving as sentinel for the other array.
My questions on above text
What does statement "when the a (b) array is exhausted"? what is 'a (b)' here?
Why is the author mentioning that it is not easy to determine the largest key and how is the space related in determining largest key?
What does author mean by "Given that we are resigned to copying the arrays"? What is resigned in this context?
Request with simple example in understanding idea which is mentioned as simple remedy?
Upvotes: 1
Views: 552
Reputation: 114599
Sometimes parenthesis are used to tell one or more similar phrases at once:
when a (b) is exhausted we copy elements from b (a)
means:
when a is exhausted we copy elements from b, when b is exhausted we copy elements from a
Two annoying things about sentinels are
We programmers are never happy to copy (or move) things around and leaving them where they already are is, if possible, better (because we are lazy). In this version of the merge sort we already gave up about trying to not copy things around... we resigned to it. Given that we must copy, we can copy things in the opposite order if we like (and of course use the copy in opposite order) because that is free(*).
(*) is free at this level of abstraction, the cost on some real CPU may be high. As almost always in the performance area YMMV.
Upvotes: 1
Reputation: 755064
"When the a (b) array is exhausted" is a shorthand for "When either the a
array or the b
array is exhausted".
The interface is dealing with sub-arrays of a bigger array, so you can't simply go writing beyond the ends of the arrays.
The code copies the data from two arrays into one other array. Since this copy is inevitable, we are 'resigned to copying the arrays' means we reluctantly accept that it is inevitable that the arrays must be copied.
Tricky...that's going to take some time to work out what is meant.
Tangentially: That's probably not the way I'd write the loop. I'd be inclined to use:
int i = al, j = bl;
for (int k = cl; i <= ar && j <= br; k++)
{
if (a[i] < b[j])
c[k] = a[i++];
else
c[k] = b[j++];
}
while (i <= ar)
c[k++] = a[i++];
while (j <= br)
c[k++] = b[j++];
One of the two trailing loops does nothing. The revised main merge loop has 3 tests per iteration versus 4 tests per iteration for the one original algorithm. I've not formally measured it, but the simpler merge loop is likely to be quicker than the original single-loop algorithm.
The first three questions are almost best suited for English Language Learners.
Upvotes: 3