Leonel

Reputation: 213

Efficient data structure for fast random access, search, insertion and deletion

I'm looking for a data structure (or structures) that would allow me to keep an ordered list of integers, no duplicates, with indexes and values in the same range.

I need four main operations to be efficient, in rough order of importance:

  1. taking the value from a given index
  2. finding the index of a given value
  3. inserting a value at a given index
  4. deleting a value at a given index

Using an array, I get 1 in O(1), but 2 is O(N), and insertions and deletions are expensive (O(N) as well, I believe).

A Linked List has O(1) insertion and deletion (once you have the node), but 1 and 2 are O(N), thus negating the gains.

I tried keeping two arrays, a[index]=value and b[value]=index, which turns 1 and 2 into O(1) but makes 3 and 4 even more costly.
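
For concreteness, a minimal sketch (hypothetical names) of an insert under that two-array scheme, showing why 3 and 4 get even more costly: shifting a means re-pointing b for every moved value.

    class TwoArrays {
        // Hypothetical sketch: a[index] = value, b[value] = index, both preallocated
        // to the full range. Shifting a forces an update of b for every moved value.
        static void insertAt(int[] a, int[] b, int size, int index, int value) {
            System.arraycopy(a, index, a, index + 1, size - index); // shift a right by one slot
            a[index] = value;
            b[value] = index;
            for (int i = index + 1; i <= size; i++) {               // re-point b for each shifted value
                b[a[i]] = i;
            }
        }
    }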

Is there a data structure better suited for this?

Upvotes: 18

Views: 29255

Answers (8)

EvilTeach

Reputation: 28882

Use a vector for the array access.

Use a map as a search index to the subscript into the vector.

  • given a subscript, fetch the value from the vector: O(1)
  • given a key, use the map to find the subscript of the value: O(log N)
  • insert a value: push back on the vector, O(1) amortized, and insert the subscript into the map, O(log N)
  • delete a value: delete from the map, O(log N)
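
A minimal sketch of this layout, assuming java.util.ArrayList stands in for the vector and TreeMap (a red-black tree) for the map; as in the bullets above, delete only removes the map entry, so the vector slot is left in place.

    import java.util.ArrayList;
    import java.util.TreeMap;

    class VectorPlusIndex {
        private final ArrayList<Integer> values = new ArrayList<>();     // subscript -> value
        private final TreeMap<Integer, Integer> index = new TreeMap<>(); // value -> subscript

        int valueAt(int subscript)     { return values.get(subscript); } // O(1)
        Integer subscriptOf(int value) { return index.get(value); }      // O(log N)

        void insert(int value) {              // O(1) amortized push back + O(log N) map insert
            values.add(value);
            index.put(value, values.size() - 1);
        }

        void delete(int value) {              // O(log N); only the map entry is removed
            index.remove(value);
        }
    }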

Upvotes: 1

Jarek Czekalski

Reputation: 11

How to achieve 2 with RB-trees? We can make them count their children on every insert/delete operation. This doesn't make those operations significantly slower. Then walking down the tree to find the i-th element is possible in O(log n) time. But I see no implementation of this method in Java or the STL.
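
A minimal sketch of that walk, assuming each node carries a child-count field that insert/delete keep up to date (the red-black rebalancing itself is omitted):

    class CountedNode {
        int value;
        int size = 1;            // number of nodes in this subtree, maintained on insert/delete
        CountedNode left, right;

        // Walk down to the i-th smallest element (0-based) in O(log n),
        // assuming the size fields are kept correct by insert/delete.
        static int select(CountedNode node, int i) {
            while (node != null) {
                int leftSize = (node.left == null) ? 0 : node.left.size;
                if (i < leftSize) {
                    node = node.left;
                } else if (i == leftSize) {
                    return node.value;
                } else {
                    i -= leftSize + 1;
                    node = node.right;
                }
            }
            throw new IndexOutOfBoundsException();
        }
    }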

Upvotes: 1

lothar

Reputation: 20257

How about using a sorted array with binary search?

Insertion and deletion are slow, but given that the data are plain integers, they can be optimized with calls to memcpy() if you are using C or C++. If you know the maximum size of the array, you can even avoid any memory allocations during its use, since you can preallocate it to the maximum size.
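
A minimal sketch of that layout in Java (the class and method names are made up for illustration), with System.arraycopy standing in for the memcpy() call and the array preallocated to its maximum size:

    import java.util.Arrays;

    class SortedIntArray {
        private final int[] data;
        private int size;

        SortedIntArray(int maxSize) { data = new int[maxSize]; }  // preallocate once

        int valueAt(int index) { return data[index]; }            // 1: O(1)

        int indexOf(int value) {                                  // 2: O(log n) binary search
            int i = Arrays.binarySearch(data, 0, size, value);
            return i >= 0 ? i : -1;
        }

        void insert(int value) {                                  // 3: O(n) shift, done as one block copy
            int i = Arrays.binarySearch(data, 0, size, value);
            if (i >= 0) return;                                   // no duplicates
            int pos = -(i + 1);
            System.arraycopy(data, pos, data, pos + 1, size - pos);
            data[pos] = value;
            size++;
        }

        void deleteAt(int index) {                                // 4: O(n) shift, also one block copy
            System.arraycopy(data, index + 1, data, index, size - index - 1);
            size--;
        }
    }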

The "best" approach depends on how many items you need to store and how often you will need to insert/delete compared to finding. If you rarely insert or delete a sorted array with O(1) access to the values is certainly better, but if you insert and delete things frequently a binary tree can be better than the array. For a small enough n the array most likely beats the tree in any case.

If storage size is a concern, the array is better than the trees, too. Trees need to allocate memory for every item they store, and the overhead of that allocation can be significant since you only store small values (integers).

You may want to profile which is faster: copying the integers when you insert/delete from the sorted array, or the tree with its memory (de)allocations.

Upvotes: 4

Ayman Hourieh

Reputation: 137416

I would use a red-black tree to map keys to values. This gives you O(log(n)) for 1, 3, 4. It also maintains the keys in sorted order.

For 2, I would use a hash table to map values to keys, which gives you O(1) performance. It also adds O(1) overhead for keeping the hash table updated when adding and deleting keys in the red-black tree.
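
A minimal sketch of the pairing in Java, assuming TreeMap as the red-black tree and HashMap as the hash table, with the question's index used as the key (class and method names are just for illustration):

    import java.util.HashMap;
    import java.util.TreeMap;

    class IndexedValues {
        private final TreeMap<Integer, Integer> byIndex = new TreeMap<>(); // key (index) -> value, sorted
        private final HashMap<Integer, Integer> byValue = new HashMap<>(); // value -> key (index)

        int valueAt(int index) { return byIndex.get(index); }   // 1: O(log n)
        int indexOf(int value) { return byValue.get(value); }   // 2: O(1)

        void put(int index, int value) {                        // 3: O(log n) + O(1)
            byIndex.put(index, value);
            byValue.put(value, index);
        }

        void deleteAt(int index) {                              // 4: O(log n) + O(1)
            Integer value = byIndex.remove(index);
            if (value != null) byValue.remove(value);
        }
    }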

Upvotes: 17

Zifre

Reputation: 27008

I like balanced binary trees a lot. They are sometimes slower than hash tables or other structures, but they are much more predictable; they are generally O(log n) for all operations. I would suggest using a Red-black tree or an AVL tree.

Upvotes: 1

BenAlabaster

Reputation: 39874

If you're working in .NET, then according to the MS docs (http://msdn.microsoft.com/en-us/library/f7fta44c.aspx):

  • SortedDictionary and SortedList both have O(log n) for retrieval
  • SortedDictionary has O(log n) for insert and delete operations, whereas SortedList has O(n).

The two differ in memory usage and speed of insertion/removal: SortedList uses less memory than SortedDictionary, and if the SortedList is populated all at once from sorted data, it's faster than SortedDictionary. So it depends on the situation which one is really best for you.

Also, your argument for the Linked List is not really valid: the insert itself might be O(1), but you have to traverse the list to find the insertion point, so overall it really isn't.

Upvotes: 0

Rob Hruska

Reputation: 120456

I don't know what language you're using, but if it's Java you can leverage LinkedHashMap or a similar Collection. It's got all of the benefits of a List and a Map, provides constant time for most operations, and has the memory footprint of an elephant. :)

If you're not using Java, the idea behind LinkedHashMap can probably still be adapted into a usable data structure for your problem.
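
A small usage sketch: LinkedHashMap gives constant-time lookups by key while remembering insertion order, though getting the i-th entry still means iterating.

    import java.util.LinkedHashMap;
    import java.util.Map;

    class LinkedHashMapSketch {
        public static void main(String[] args) {
            Map<Integer, Integer> map = new LinkedHashMap<>();
            map.put(5, 42);                       // O(1) insert, insertion order is remembered
            map.put(3, 7);
            System.out.println(map.get(5));       // 42 -- O(1) lookup by key
            for (Map.Entry<Integer, Integer> e : map.entrySet()) {
                System.out.println(e.getKey() + " -> " + e.getValue()); // iterates in insertion order
            }
        }
    }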

Upvotes: 1

CookieOfFortune

Reputation: 14004

How about a TreeMap? It's O(log n) for the operations described.

Upvotes: 0
