Reputation: 825
I want to keep a linked list in sorted order when inserting elements (about 200000 elements in the list), which algorithm can you recommend? I made a simple implementation using insertion sort, but its performance is very very bad (a lot of CPU usage).
Thanks for your help.
I did some comparison between merge sort and insertion sort but it seems that insertion sort has better performance, I am a bit confused by this result. Can you tell me what's wrong and if there is a better algorithm?
My code (for simplicity, I omitted the prev node in the node struct):
struct node {
int number;
struct node *next;
};
Insertion sort :
void insert_node(int value) {
struct node *new_node = NULL;
struct node *cur_node = NULL;
struct node *last_node = NULL;
int found; /* 1 means found a place to insert the new node in, 0 means not*/
new_node = (struct node *)malloc(sizeof(struct node *));
if(new_node == NULL) {
printf("memory problem\n");
}
new_node->number = value;
/* If the first element */
if (head == NULL) {
new_node->next = NULL;
head = new_node;
}
else if (new_node->number < head->number) {
new_node->next = head;
head = new_node;
}
else {
cur_node = head;
found = 0;
while (( cur_node != NULL ) && ( found == 0 )) {
if( new_node->number < cur_node->number )
{
found = 1;
}
else
{
last_node = cur_node;
cur_node = cur_node->next;
}
}
/* We got the right place to insert our node */
if( found == 1 )
{
new_node->next = cur_node;
}
/* Insert at the tail of the list */
else
{
last_node->next = new_node;
new_node->next = NULL;
}
}
Merge Sort :
/* add a node to the linked list */
struct node *addnode(int number, struct node *next) {
struct node *tnode;
tnode = (struct node*)malloc(sizeof(*tnode));
if(tnode != NULL) {
tnode->number = number;
tnode->next = next;
}
return tnode;
}
/* perform merge sort on the linked list */
struct node *merge_sort(struct node *head) {
struct node *head_one;
struct node *head_two;
if((head == NULL) || (head->next == NULL))
return head;
head_one = head;
head_two = head->next;
while((head_two != NULL) && (head_two->next != NULL)) {
head = head->next;
head_two = head->next->next;
}
head_two = head->next;
head->next = NULL;
return merge(merge_sort(head_one), merge_sort(head_two));
}
/* merge the lists.. */
struct node *merge(struct node *head_one, struct node *head_two) {
struct node *head_three;
if(head_one == NULL)
return head_two;
if(head_two == NULL)
return head_one;
if(head_one->number < head_two->number) {
head_three = head_one;
head_three->next = merge(head_one->next, head_two);
} else {
head_three = head_two;
head_three->next = merge(head_one, head_two->next);
}
return head_three;
}
Upvotes: 4
Views: 7970
Reputation: 349
I have made the necessary changes in your code. I have tested it and it works perfectly fine. You should be able to close the query now.
struct node *insert_node( struct node *head, int *value )
{
struct node *new_node = NULL;
struct node *cur_node = NULL;
struct node *last_node = NULL;
int found; /* 1 means found a place to insert the new node in, 0 means not*/
new_node = (struct node *)malloc(sizeof(struct node *));
if(new_node == NULL)
{
printf("memory problem\n");
}
new_node->number = *value;
/* If the first element */
if (head == NULL)
{
new_node->prev = new_node->next = NULL;
head = new_node;
}
else if (new_node->number < head->number)
{
new_node->next = head;
head = new_node;
}
else
{
cur_node = head;
found = 0;
while (( cur_node != NULL ) && ( found == 0 ))
{
if( new_node->number < cur_node->number )
found = 1;
else
{
last_node = cur_node;
cur_node = cur_node->next;
}
}
/* We got the right place to insert our node */
if( found == 1 )
{
new_node->next = cur_node;
new_node->prev = last_node;
last_node->next = new_node;
cur_node->prev = new_node;
}
/* Insert at the tail of the list */
else
{
last_node->next = new_node;
new_node->next = NULL;
new_node->prev = last_node;
}
}
return head;
}
Upvotes: 0
Reputation: 5794
I like heaps for things like this. Insertion and lookup is O(log(n)) and traversal is O(n). The 'heapness' property ensures the data are sorted at all times. You can traverse forwards and backwards so you get the advantages of a doubly-linked list.
Upvotes: 0
Reputation: 5370
actually you don't want to sort the list. Merge Sort is used to sort unsorted list. (some here already pointed that out).
To keep a list sorted you have to insert each element at the right position. so basically you have to:
The complexity here is the search algorithm.
So a binary tree or even better a AVL tree (http://en.wikipedia.org/wiki/AVL_tree) would be very effective.
Another option would be to use binary search (http://en.wikipedia.org/wiki/Binary_search_algorithm). But again this is only effective with a skip list.
So either way you change your data structure or with a simple double linked list a O(N) complexity is the best you can achieve.
Upvotes: 0
Reputation: 27536
You are not correctly implementing the merge sort which is based on recursively dividing the list into two parts, sorting them and merging the result. But in your code, you don't really divide the list into two halves.
Notice that in the lines:
while((head_two != NULL) && (head_two->next != NULL)) {
head = head->next;
head_two = head->next->next;
}
head_two = head->next;
head->next = NULL;
you exit the while loop when head_two reaches the end of the list: If for example you reach head_two->next == NULL
at the loop, then you exit it with head->next->next == NULL
. And when you run head_two = head->next;
, you get a head_two
such that head_two->next == NULL
which is the last item in the list.
That means that you are basically doing an insertion sort and not a merge sort.
So try to keep track of the length of the list, by adding a parameter length
to the function merge_sort
to be able to split it into 2. Here is a good explanation of the algorithm in wikipedia.
Upvotes: 1
Reputation: 169713
If you need to insert m
elements into a sorted list of size n
, you can either do a straight-forward insert of complexity m·n
, or presort the elements (I suggest merge sort for that) and merge them into the original list, which has complexity m·ln(m) + (n + m)
.
Basically, you'll be using specialized versions of insertion and merge sort, which can both be implemented as online algorithms and are thus well-suited for linked list. Keep in mind that the 'textbook' implementation of merge sort doesn't perform well as it unnecessarily iterates the list when dividing it into sublist: you need the slightly more complex stack-based version...
Upvotes: 0
Reputation:
For an online solution (insert the items as they arrive), a balanced binary tree can be a good option. It allows insertions (and also deletions) in time O(Log(N)).
Otherwise, MergeSort can be applied to the full list.
In both cases, O(N.Log(N)) comparisons in total.
Upvotes: 2
Reputation: 54772
Priority queues (or Skip Lists specifically) are what you are looking for. They allow logarithmic inserts instead of O(n) for simple linked lists.
Here's a skip list implementation in C on GitHub.
Upvotes: 0
Reputation: 14705
To insert elements in a linked list, the precondition that it is sorted does not help! There is no algorithm to help.
You might want to consider another structure for your elements. Depending on your situation a simple heap or a binary search tree might serve you well.
If you want to insert a lot of elements into a large sorted linked list you can sort them and then do a very fast merge O(N).
Upvotes: 6
Reputation: 7392
A linked list necessarily requires O(N) steps to move to an arbitrary position in the list. So, it sounds like it is not the right data structure for you.
You can try to use a sorted set - std::set in C++ or SortedSet<> in C#.
If the sorted set abstraction does not fit your problem, probably the simplest data structure that is similar to a sorted linked list is a skip list: http://igoro.com/archive/skip-lists-are-fascinating/. The more complicated alternatives would be AVL trees and red-black trees.
Upvotes: 0