Reputation: 10122

An indexed set (for efficient removal in a vector)

I was just about to implement my own class for efficient removal from an array, but thought I'd ask to see if anything like it already exists. What I want is list-like access efficiency but using an array. I want to use an array for reasons of cache coherence and so I don't have to continually be calling a memory allocator (as using std::list would when allocating nodes).

What I thought about doing was creating a class with two arrays. The first is a set of elements and the second array is a set of integers where each integer is a free slot in the first array. So I can add/remove elements from the array fairly easily, without allocating new memory for them, simply by taking an index from the free list and using that for the new element.

Does anything like this exist already? If I do my own, I'll have to also make my own iterators, so you can iterate the set avoiding any empty slots in the array, and I don't fancy that very much.

Thanks.

Note: The kind of operations I want to perform on the set are:

Iteration
Random access of individual elements, by index (or "handle" as I'm thinking of it)
Removal of an element anywhere in the set
Addition of an element to the set (order unimportant)

Upvotes: 5

Answers (4)

Paul Hankin

Reputation: 58211

You can use a single array by storing the information about the "empty" slots in the space of the empty slots.

For a contiguous block of empty slots in your array A, say of k slots starting from index n, store (k, n') at location A[n] (where n' is the index of the next block of free indexes). You may have to pack the two ints into a single word if your array is storing word-sized objects.

You're essentially storing a linked-list of free blocks, like a memory-manager might do.

It's a bit of a pain to code, but this'll allow you to allocate a free index in O(1) time, and to iterate through the allocated indices in O(n) time, where n is the number of allocated slots. Freeing an index will be O(n) time though in the worst case: this is the same problem as fragmented memory.

For the first free block, you can either store the index separately, or have the convention that you never allocate A[0] so you can always start a free-index search from there.

Upvotes: 1

Timothy Shields

Reputation: 79441

std::list<T> actually does sound exactly like the theoretically correct data structure for your job, because it supports the four operations you listed, all with optimal space and time complexity. std::list<T>::iterator is a handle that remains valid even if you add/remove other items to/from the list.

It may be that there is a custom allocator (i.e. not std::allocator<T>) that you could use with std::list<T, Allocator> to get the performance you want (internally pool nodes and then don't do runtime allocation everytime you add or remove a node). But that might be overkill.

I would start just using a std::list<T> with the default allocator and then only look at custom allocators or other data structures if you find the performance is too bad for your application.

Upvotes: 1

Sean Middleditch

Reputation: 2605

If maintaining order of elements is irrelevant, use swap-and-pop.

Copy/move the last element over the one to be removed, then pop the back element. Super easy and efficient. You don't even need to bother with special checks for removing the element since it'll Just Work(tm) if you use the standard C++ vector and operations.

*iter = std::move(container.back());
container.pop_back();

I don't recall if pop_back() invalidated iterators on vector, but I don't think it does. If it does, just use indices directly or to recalculate a new valid iterator.

auto delta = iter - container.begin();
// mutate container
iter = container.begin() + delta;

Upvotes: 1

Chirag Desai

Reputation: 1281

std::map might be useful in your case.

Upvotes: 0

An indexed set (for efficient removal in a vector)

Answers (4)

Related Questions