Ayush Khemka
Ayush Khemka

Reputation: 47

Insertion into a skip list

A skip list is a data structure in which the elements are stored in sorted order and each node of the list may contain more than 1 pointer, and is used to reduce the time required for a search operation from O(n) in a singly linked list to O(lg n) for the average case. It looks like this:

Skip List

Reference: "Skip list" by Wojciech Muła - Own work. Licensed under Public domain via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:Skip_list.svg#mediaviewer/File:Skip_list.svg

It can be seen as an analogy to a ruler:

Ruler markings

In a skip list, searching an element and deleting one is fine, but when it comes to insertion, it becomes difficult, because according to Data Structures and Algorithms in C++: 2nd edition by Adam Drozdek:

To insert a new element, all nodes following the node just inserted have to be restructured; the number of pointers and the value of pointers have to be changed.


I can construe from this that although choosing a node with random number of pointers based on the likelihood of nodes to insert a new element, doesn't create a perfect skip list, it gets really close to it when large number of elements (~9 million for example) are considered.

My question is: Why can't we insert the new element in a new node, determine its number of pointers based on the previous node, attach it to the end of the list, and then use efficient sorting algorithms to sort just the data present in the nodes, thereby maintaining the perfect structure of the skip list and also achieving the O(lg n) insert complexity?

Edit: I haven't tried any code yet, I'm just presenting a view. Simply because implementing a skip list is somewhat difficult. Sorry.

Upvotes: 2

Views: 1572

Answers (3)

Jim Mischel
Jim Mischel

Reputation: 134045

There is no need to modify any following nodes when you insert a node. See the original paper, Skip Lists: A Probabilistic Alternative to Balanced Trees, for details.

I've implemented a skip list from that reference, and I can assure you that my insertion and deletion routines do not modify any nodes forward of the insertion point.

I haven't read the book you're referring to, but out of context the passage you highlight is just flat wrong.

Upvotes: 1

Nick
Nick

Reputation: 10539

You need to have function / method that search for location.

It need to do following:

  1. if you insert unique keys, it need to locate the node. then you keep everything, just change the data (baggage). e.g. node->data = data.

  2. if you allow duplicates, or if key is not found, then this function / method need to give you previous node on each height (lane). Then you determine height of new node and insert it after the found nodes.

Here is my C realisation: https://github.com/nmmmnu/HM2/blob/master/hm_skiplist.c

You need to check following function:

static const hm_skiplist_node_t *_hm_skiplist_locate(const hm_skiplist_t *l, const char *key, int complete_evaluation);

it stores the position inside hm_skiplist_t struct.

complete_evaluation is used to save time in case you need the data and you are not intend to insert / delete.

Upvotes: 0

Ivaylo Strandjev
Ivaylo Strandjev

Reputation: 70989

You have a problem on this point and then use efficient sorting algorithms to sort just the data present in the nodes. Sorting the data will have complexity O(n*lg(n)) and thus it will increase the complexity of insertion. In theory you can choose "perfect" number of links for each node being inserted, but even if you do that, when you perform remove operations, the perfectness will be "broken". Using the randomized approach is close enough to perfect structure to perform well.

Upvotes: 0

Related Questions