Reputation: 120
Preface:
I do not care about the viability of my particular quick sort algorithm. That is not what the question is about. I am curious as to why the program I have already written behaves the way it does, not about why I should use the STL or whatever. This is for learning purposes. (for instance, I'm aware picking pivot as the head is not the best case -- that's not what I'm asking though)
Question:
I have a linked list composed of Node*
s. I've written a quick sort algorithm for it specifically. It works by taking a given linked list, and recursively splitting it into sublists based on a chosen pivot
. The pivot is always the head
(front) of the list, and the sublists left
and right
are lists that contain values less than and greater than/equal to the pivot, respectively. I've almost gotten it down, but I don't know how to properly add_link the pivots. In a previous iteration, I would just add_link the pivot to the right sublist always, but that caused an infinite loop, so I ended up skipping the pivots entirely when comparing values against it. The resulting list is sorted, but it is missing every value that was used for a pivot. How should I fix this?
Here is a minimal reproducible example:
#include <iostream>
struct Link {
Link(int d=int(), Link* n=nullptr) {
data = d;
next = n;
}
int data;
Link* next;
};
struct LinkList {
Link* sentinel;
size_t size;
LinkList(Link* h=new Link, size_t s=0) {
size = s;
sentinel = h;
}
~LinkList() {
Link* current = sentinel;
while (current != 0) {
Link* next = current->next;
delete current;
current = next;
}
sentinel = 0;
}
};
// Prototypes
Link* sort_q(Link* sentinel);
void divide(Link* sentinel, Link* cmp, Link*& lower, Link*& greater);
Link* concatenate(Link* lower, Link* greater);
// helpful function calls quick sort
void sort(LinkList& l) {
l.sentinel = sort_q(l.sentinel);
}
// returns false if there sentinel = null; true if add_link was successful
bool add_link(Link* sentinel, Link* add_link, Link*& end) {
if (sentinel == nullptr) {
return false;
}
Link* curr = sentinel;
for (; curr->next; curr = curr->next) {}
curr->next = add_link;
end = curr->next;
return true;
}
Link* sort_q(Link* sentinel) {
Link* lower = nullptr;
Link* greater = nullptr;
Link* cmp = sentinel;
// base case LinkList = null or sentinel->null
if (sentinel == nullptr || sentinel->next == nullptr) {
return sentinel;
}
divide(sentinel, cmp, lower, greater);
lower = sort_q(lower);
greater = sort_q(greater);
return concatenate(lower, greater);
}
void divide(Link* sentinel, Link* cmp, Link*& lower, Link*& greater) {
lower = new Link, greater = new Link;
// lend is pointer to end of lower subLinkList
// rend is pointer to end of greater subLinkList
Link* lend = nullptr, * rend = nullptr;
// loop through LinkList until end
while (sentinel != nullptr) {
if (sentinel == cmp) {
sentinel = sentinel->next; continue;
}
if (sentinel->data < cmp->data) {
// break current link
Link* tmp = sentinel;
sentinel = sentinel->next;
tmp->next = nullptr;
// if subLinkList is not empty, add_link current Link to subLinkList and update end pointer
if (add_link(lend, tmp, lend))
continue;
// otherwise, "add_link" current Link to empty subLinkList and update end pointer manually
lower->next = tmp;
lend = lower->next;
}
else {
// break current link
Link* tmp = sentinel;
sentinel = sentinel->next;
tmp->next = nullptr;
// if subLinkList is not empty, add_link current Link to subLinkList and update end pointer
if (add_link(rend, tmp, rend))
continue;
// otherwise, "add_link" current Link to empty subLinkList and update end pointer manually
greater->next = tmp;
rend = greater->next;
}
}
// remove dummy Link(s)
if (lower->next)
lower = lower->next;
else
lower = cmp;
if (greater->next)
greater = greater->next;
else
greater = cmp;
// unlink cmp
cmp->next = nullptr;
}
// connected subLinkLists
Link* concatenate(Link* lower, Link* greater) {
Link* sentinel;
sentinel = lower;
while (lower->next != nullptr) {
lower = lower->next;
}
lower->next = greater;
return sentinel;
}
void print(LinkList &l) {
for (Link* n = l.sentinel; n != NULL; n = n->next) {
std::cout << n->data << '\n';
}
}
int main() {
// set up linked LinkList 8->4->5->11->7->5->3->9->null
Link* sentinel = new Link(8 , new Link(4, new Link(5, new Link(11, new Link(7, new Link(5, new Link(3, new Link(9))))))));
LinkList l(sentinel,5);
sort(l);
print(l);
return 0;
}
The output I want to have is
3
4 // pivot missing from output
5
5
7
8 // pivot missing from output
9
11
but it outputs
3
5
5
7
9
11
Edit #1:
I have tried adding the pivot between concatenations, but that does not work either. It produces different results, but not quite the same. Modifying sort_q()
and partition()
like so:
Link* qsort(Link* sentinel) {
// ... other stuff before modification
return concatenate(lower, cmp); // changed greater argument to pass cmp which is now cmp->next = greater once finishing partition()
}
void partition(Link* sentinel, Link*& cmp, Link*& lower, Link*& greater) {
// .. other stuff before modifications
// remove dummy Link
if (lower->next)
lower = lower->next;
if (greater->next)
greater = greater->next;
cmp->next = greater; // cmp points to greater (greater) sublist
}
The output becomes
3
4
5
7
0
8
11
0
Upvotes: 1
Views: 188
Reputation: 9005
You have assumptions that you can get both a lower and a greater list, and you already make sure you have at least 2 items in the list before you call divide, so that is good.
You just need to handle all the cases at the end of divide. Make sure cmp ends up in one of the two lists. Almost like you had before, but you need to handle the case where both lower and greater lists are non-empty.
// remove dummy node(s)
cmp->next = nullPtr;
if (lower->next && greater->next) {
// we have two lists. put cmp on greater
lower = lower->next;
cmp->next = greater->next;
greater = cmp;
}
else if (lower->next) {
// only a lower list, use cmp on greater
lower = lower->next;
greater = cmp;
}
else if (greater->next) {
// only a greater list, use cmp as lower.
greater = greater->next;
lower = cmp;
}
seeing the above, handling all 3 cases, it can be simplified to this:
// remove dummy node(s)
if (lower->next) {
// we have lower node, so put cmp on greater
lower = lower->next;
cmp->next = greater->next;
greater = cmp;
}
else if (greater->next) {
// only a greater list, use cmp as lower.
greater = greater->next;
lower = cmp;
cmp->next = nullPtr;
}
then use concatenate(lower,greater)
afterwards. Although it could be optimized to have divide
concatenate the lists and just return the sentinel, but that's a bit more of a rewrite.
EDIT: putting it all together to eliminate your memory leak, it should be like this (note, I have not compiled or tested)
void divide(Node* sentinel, Node* cmp, Node*& lower, Node*& greater) {
lower = nullptr, greater = nullptr;
// lend is pointer to end of lower sublist
// rend is pointer to end of greater sublist
Node* lend = nullptr, * rend = nullptr;
// loop through list until end
while (sentinel != nullptr) {
if (sentinel == cmp) {
sentinel = sentinel->next; continue;
}
if (sentinel->data < cmp->data) {
// break current link
Node* tmp = sentinel;
sentinel = sentinel->next;
tmp->next = nullptr;
// if sublist is not empty, append current node to sublist and update end pointer
if (append(lend, tmp, lend))
continue;
// otherwise, "append" current node to empty sublist and update end pointer manually
lend = lower = tmp;
}
else {
// break current link
Node* tmp = sentinel;
sentinel = sentinel->next;
tmp->next = nullptr;
// if sublist is not empty, append current node to sublist and update end pointer
if (append(rend, tmp, rend))
continue;
// otherwise, "append" current node to empty sublist and update end pointer manually
rend = greater = tmp;
}
}
// insert cmp node
if (lower) {
// we have lower node, so put cmp on greater
cmp->next = greater;
greater = cmp;
}
else if (greater) {
// only a greater list, use cmp as lower.
lower = cmp;
cmp->next = nullptr;
}
}
Upvotes: 2
Reputation: 16869
The main problem I see is that you sometimes, but not always, add the pivot to the "left" or "right" list in partition()
. This inconsistency likely violates the desired functionality of partition()
. However, this functionality is not documented. So I'll revise my assessment to say the main problem is that you have not documented your functions. The bigger your project, the more likely you are to encounter problems because of the lack of documentation.
Document!
I would expect to see partition()
documented to say that it partitions the list into three pieces: left, right, and pivot.
/// Partitions the list starting at `head` into three pieces. Data less than
/// the given partition's data is put in the `left` list, while the remaining
/// data (except `*partition` itself) is put in the `right` list. The third
/// piece consists of `*partition`, which is assumed to be in the given list.
void partition(Node* head, Node* pivot, Node*& left, Node*& right) {
Another option might be to immediately put the pivot in one of these lists. However, these lists are going to be sorted immediately after the call to partition()
, and there is no need to include *partition
in that sort. Sort left, sort right, then concatenate left, partition, and right. In addition to being faster because there are fewer nodes to sort in the next recursive call, keeping the partition node out of your lists will guarantee that your lists get shorter with each recursive call.
It's your call, though, since you are the designer. The important thing is to decide what functionality you want, document it, and follow that decision.
Enforce consistency!
Once you have specified functionality, it is easier to verify that your code respects whatever design decisions you have made. Your sort_q()
function fails to account for the above specification in that it does not add the partition to the list. There is probably a simpler way to do this than the below, but the below serves the purpose with little re-writing of your code. (This is the end of sort_q()
.)
// ** Return left -> pivot -> right, not just left -> right **
return concatenate(left, concatenate(pivot, right));
}
In addition, your partition()
function violates this specification by (sometimes) adding the partition node to one of your lists. Stop doing that. Set your links to null instead.
// remove dummy node(s)
if (left->next)
left = left->next;
else
left = nullptr; // <-- null, not pivot
if (right->next)
right = right->next;
else
right = nullptr; // <-- null, not pivot
Hmm... I had avoided re-writing your code earlier, but this particular structure strikes me as needlessly complicated. If a certain pointer is not null, assign it to something. Otherwise (when that pointer is null), assign null instead of the pointer (which is null, so same thing). That is rather wordy. Just assign the pointer.
// remove dummy node(s)
left = left->next;
right = right->next;
// *** Memory leak, as we removed the nodes without deleting them ***
Oh that memory leak is a pain. Especially since you don't really need the dummy nodes. But that digresses into a separate question, so for now I'll just note that your code is very close to working if you had initialized left
and right
to nullptr
. (Your hint is that if left
is initialized to nullptr
then, at the moment append()
is called, lend
is null if and only if left
is null.)
One final piece: if you've followed all this so far, you probably ran into a segmentation fault at some point. You are missing a check in concatenate()
(maybe the reason you started using dummy nodes?)
// connected sublists
Node* concatenate(Node* left, Node* right) {
if ( left == nullptr ) // <-- Add this check
return right; // <-- It is very easy to concatenate to nothing.
Node* head = left;
That should address the immediate question. There are other places to improve, though, so keep working at it. (There are other parts of your code that I would call "needlessly complicated", albeit less severely – and less obviously – than what that if
statement became.)
Upvotes: 1