GManNickG
GManNickG

Reputation: 503913

Dynamic Array with O(1) removal of any element

This question is about a data structure I thought of. It is a dynamic array, like std::vector<> in C++, except the removal algorithm is different.

In a normal dynamic array, when an element is removed, all the remaining elements must be shifted down, which is O(n), unless it's the last element, which will be O(1).

In this one, if any element is removed, it is replaced by the last element. This of course loses ordering of the elements. But now removal of any element is constant time.

A list will have the same removal times, but this structure has random access. The only caveat with that is you don't know what you're accessing, since ordering could be jumbled, so what use is random access anyway. Plus a list won't mess up any pointers/iterators to elements.

So meh, this structure seems rather useless except for the very specific task of strictly walking through elements and perhaps removing them along the way. A list can do the same, but this has better cache performance.

So, does this strange/useless structure have a name, and does it have any uses? Or just a nice little brain storm?

Upvotes: 7

Views: 3596

Answers (6)

Michael Anderson
Michael Anderson

Reputation: 73490

I dont know of a name for it but it is better than a list in certain cases.

In particular, this would be vastly superior to a singly or doubly linked list for very small data. Because you store everything contiguously there's no extra pointer overhead per element.

Upvotes: 0

Dave Gamble
Dave Gamble

Reputation: 4174

It's called a Set.

Upvotes: -1

Pete Kirkham
Pete Kirkham

Reputation: 49311

So, does this strange/useless structure have a name, and does it have any uses?

I've used something similar in simulations of multi-process systems.

In a scheduler for processes implemented as state machines, each process is either waiting for an external event, active or completed. The scheduler has an array of pointers to the processes.

Initially each process is active, and the scheduler has the index of the last waiting and first completed process, initially zero and the length of the array.

V-- waiting
[ A-active, B-active, C-active, D-active ]
                             completed --^
^- run

To step the process to its next state, the scheduler iterates over the array and runs each process in turn. If a process reports that it is waiting, it's swapped with the process after the last waiting process in the array.

           V-- waiting
[ A-waiting, B-active, C-active, D-active ]
                              completed --^
             ^- run

If it reports that it has completed, it's swapped with the process before the first completed array.

           V-- waiting
[ A-waiting, D-active, C-active, B-completed ]
                   completed --^
             ^- run

So as the scheduler runs and processes transition from active to waiting or completed, the array become ordered with all the waiting processes at the start, all the active ones in the middle, and the completed ones at the end.

                      V-- waiting
[ A-waiting, C-waiting, D-active, B-completed ]
                   completed --^
                        ^- run

After either a certain number of iterations, or when there are no more active processes, the completed processes are cleaned out of the array and external events are processed:

                      V-- waiting
[ A-waiting, C-waiting, D-completed, B-completed ]
          completed --^
                        ^- run == completed so stop

This is similar in that it's using swapping to remove items from a collection, but it is removing items from both ends rather and leaving the 'collection' in the middle.

Upvotes: 3

yairchu
yairchu

Reputation: 24774

I remember using this method plenty of times before. But I don't know a name for it.

Simple example: In a computer game you are iterating all the "bad guys" and calculating their movements etc. One thing that can happen to them is to disappear (their dead body finished fading away and is 99% transparent now). At that point you remove it from the list like you do, and resume iterator without increasing the iteration counter.

Something similar to this is done in a Binary Heap when deleting an item, however there the next step is to maintain the heap rule - O(log n).

Upvotes: 2

Christoffer
Christoffer

Reputation: 12910

Hm, does that algorithm really have O(1) removal time?

That would mean that

  1. Finding the element to remove is O(1)
  2. Finding the last element (which will replace the deleted element) is O(1)
  3. Finding the second-to-last element (the new "last" element) is O(1)

...which is not possible in any data structure I can come up with. Although a double-linked list could fullfill these constraints, given that you've already got a pointer to the element to remove.

Upvotes: -1

Rafał Dowgird
Rafał Dowgird

Reputation: 45091

This idea is used in Knuth (Fisher–Yates) shuffle. An element picked at random is replaced with the last one in the array. Since what we want is a random permutation anyway, the reordering doesn't matter.

Upvotes: 6

Related Questions