user173342

Reputation: 1840

Sorted data structure for in-order iteration, ordered push, and removal of N elements from the top

What is considered an optimal data structure for pushing elements in order (so inserts at any position, after finding the correct position), in-order iteration, and popping N elements off the top (i.e. the N smallest elements, with N determined by comparisons against a threshold value)? The push and pop need to be particularly fast (they run every iteration of a loop), while the in-order full iteration of the data happens at a variable rate, likely an order of magnitude less often. The data can't be purged by the full iteration; it needs to be left unchanged. Everything that is pushed will eventually be popped, but since a pop can remove multiple elements there can be more pushes than pops. The scale of data in the structure at any one time could go up to hundreds or low thousands of elements.

I'm currently using a std::deque and binary search to insert elements in ascending order. Profiling shows it taking up the majority of the time, so something has got to change. std::priority_queue doesn't allow iteration, and hacks I've seen to do it won't iterate in order. Even on a limited test (no full iteration!), the std::set class performed worse than my std::deque approach.

None of the classes I'm messing with seem to be built with this use case in mind. I'm not averse to making my own class, if there's a data structure not to be found in STL or boost for some reason.

edit:

There are two major functions right now, push and prune. push uses 65% of the time, prune uses 32%. Most of the time used in push is due to insertion into the deque (64% out of 65%). Only 1% comes from the binary search to find the position.

template<typename T, size_t Axes>
void Splitter<T, Axes>::SortedData::push(const Data& data) //65% of processing
{
 size_t index = find(data.values[(axis * 2) + 1]);

 this->data.insert(this->data.begin() + index, data); //64% of all processing happens here
}

template<typename T, size_t Axes>
void Splitter<T, Axes>::SortedData::prune(T value) //32% of processing
{
 auto top = data.begin(), end = data.end(), it = top;

 for (; it != end; ++it)
 {
  Data& element = *it; // renamed to avoid shadowing the member 'data'

  if (element.values[(axis * 2) + 1] > value) break;
 }

 data.erase(top, it);
}

template<typename T, size_t Axes>
size_t Splitter<T, Axes>::SortedData::find(T value)
{
 size_t start = 0;
 size_t end = this->data.size();

 if (!end) return 0;

 size_t diff;

 while ((diff = (end - start) >> 1)) // assignment intended
 {
  size_t mid = diff + start;

  if (this->data[mid].values[(axis * 2) + 1] <= value)
  {
   start = mid;
  }
  else
  {
   end = mid;
  }
 }

 return this->data[start].values[(axis * 2) + 1] <= value ? end : start;
}

Upvotes: 5

Views: 1905

Answers (4)

ltjax

Reputation: 15997

With your requirements, a hybrid data structure tailored to your needs will probably perform best. As others have said, contiguous memory is very important, but I would not recommend keeping the array sorted at all times. I propose you use 3 buffers (1 std::array and 2 std::vectors):

  • 1 (constant-size) Buffer for the "insertion heap". Needs to fit into the cache.
  • 2 (variable-sized) Buffers (A+B) to maintain and update sorted arrays.

When you push an element, you add it to the insertion heap via std::push_heap. Since the insertion heap is constant-size, it can overflow. When that happens, you std::sort it backwards and std::merge it with the already-sorted buffer (A) into the third buffer (B), resizing them as needed. B becomes the new sorted buffer and the old one can be discarded, i.e. you swap A and B for the next bulk operation. When you need the sorted sequence for iteration, you do the same merge. When you remove elements, you compare the top element of the heap with the last element of the sorted sequence and remove whichever is smaller (which is why you sort backwards: so that you can pop_back instead of pop_front).

For reference, this idea is loosely based on sequence heaps.
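To make the scheme concrete, here is a minimal sketch of the three-buffer idea. The class and constant names (HybridSorted, kHeapCapacity) are my own, and for simplicity it merges the heap into the sorted buffer before pruning, rather than implementing the answer's refinement of comparing the heap top against the sorted buffer's back:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <iterator>
#include <vector>

// Sketch of the hybrid: a small fixed-capacity heap absorbs pushes, and is
// merged into a sorted vector only on overflow, prune, or iteration.
// The sorted buffer is kept *descending* so the smallest elements sit at the
// back and can be removed with pop_back.
class HybridSorted {
public:
    void push(int v) {
        if (heap_.size() == kHeapCapacity) flush(); // overflow: bulk-merge
        heap_.push_back(v);
        std::push_heap(heap_.begin(), heap_.end(), std::greater<int>());
    }

    // Remove every element <= threshold (the question's prune operation).
    void prune(int threshold) {
        flush();
        while (!sorted_.empty() && sorted_.back() <= threshold)
            sorted_.pop_back(); // smallest element is at the back
    }

    // In-order (ascending) visit; merges any pending pushes first.
    template <typename F>
    void for_each_ascending(F f) {
        flush();
        for (auto it = sorted_.rbegin(); it != sorted_.rend(); ++it) f(*it);
    }

private:
    void flush() {
        if (heap_.empty()) return;
        std::sort(heap_.begin(), heap_.end(), std::greater<int>()); // descending
        scratch_.clear();
        scratch_.reserve(sorted_.size() + heap_.size());
        std::merge(heap_.begin(), heap_.end(), sorted_.begin(), sorted_.end(),
                   std::back_inserter(scratch_), std::greater<int>());
        sorted_.swap(scratch_); // scratch_ becomes the discard buffer (A/B swap)
        heap_.clear();
    }

    static constexpr std::size_t kHeapCapacity = 64; // should fit in cache
    std::vector<int> heap_;    // insertion heap
    std::vector<int> sorted_;  // sorted descending: smallest at the back
    std::vector<int> scratch_; // merge target, swapped with sorted_
};
```

The point of the heap is that the expensive O(n) merge is amortized over kHeapCapacity pushes, while each individual push is O(log k) into a cache-resident buffer.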

Upvotes: 2

Grimm The Opiner

Reputation: 1806

From your edit, it sounds like the delay is in copying - is it a complex object? Can you heap-allocate and store pointers in the structure, so each entry is created only once? You'll need to provide a custom comparator that takes pointers, as the object's operator<() wouldn't be called otherwise. (The custom comparator can simply call operator<().)

EDIT: Your own figures show it's the insertion that takes the time, not the 'sorting'. While some of that insertion time is spent copying your object, some (possibly most) is spent creating the internal structure that will hold it - and I don't think that will change between list/map/set/queue etc. If you can predict the likely eventual/maximum size of your data set, can write or find your own sorting algorithm, and the time is being lost in allocating objects, then vector might be the way to go.
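The pointer idea above can be sketched like this; the Data struct and PtrLess comparator are hypothetical stand-ins, not taken from the question:

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical payload standing in for the question's Data type.
struct Data {
    double key;
    bool operator<(const Data& other) const { return key < other.key; }
};

// Comparator that takes pointers and forwards to Data::operator<,
// as the answer suggests.
struct PtrLess {
    bool operator()(const std::unique_ptr<Data>& a,
                    const std::unique_ptr<Data>& b) const {
        return *a < *b;
    }
};

// The container stays sorted, but an insert only shifts pointers;
// each Data object is heap-allocated once and never copied again.
void push(std::vector<std::unique_ptr<Data>>& v, double key) {
    auto p = std::make_unique<Data>(Data{key});
    auto pos = std::upper_bound(v.begin(), v.end(), p, PtrLess{});
    v.insert(pos, std::move(p));
}
```

Shifting a pointer is a fixed 8-byte move regardless of sizeof(Data), which is where the saving comes from if the object is large.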

Upvotes: 0

comocomocomocomo

Reputation: 4942

You save time with the binary search, but insertion at arbitrary positions of the deque is slow, since it has to shift elements. I would suggest a std::map instead.
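A small sketch of this suggestion; since the question's pushed values can repeat, std::multimap is the safer variant (std::map would silently drop duplicate keys), and the int payload is a stand-in for the question's Data type:

```cpp
#include <cassert>
#include <map>

// Keys are the sort values; duplicates are allowed in a multimap.
using Sorted = std::multimap<double, int>;

void push(Sorted& s, double key, int payload) {
    s.emplace(key, payload); // O(log n) insert, no elements are shifted
}

void prune(Sorted& s, double threshold) {
    // Remove every element whose key is <= threshold, matching the
    // question's prune(): upper_bound is the first key > threshold.
    s.erase(s.begin(), s.upper_bound(threshold));
}
```

Iteration over the map is already in key order, so the full in-order pass needs no extra work; the trade-off is one node allocation per push and poorer cache locality than a contiguous container.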

Upvotes: 0

Dino

Reputation: 609

Have you tried messing around with std::vector? As weird as it may sound, it could actually be pretty fast because it uses contiguous memory. If I remember correctly, Bjarne Stroustrup was talking about this at Going Native 2012 (http://channel9.msdn.com/Events/GoingNative/GoingNative-2012/Keynote-Bjarne-Stroustrup-Cpp11-Style but I'm not 100% sure that it's in this video).
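For comparison with the question's deque code, a minimal version of the same push/prune on a plain std::vector might look like this (using double keys directly rather than the question's Data type):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Keep the vector sorted ascending; std::upper_bound replaces the
// hand-written find() from the question.
void push(std::vector<double>& v, double value) {
    auto pos = std::upper_bound(v.begin(), v.end(), value);
    v.insert(pos, value); // shifts the tail, but as one contiguous block move
}

void prune(std::vector<double>& v, double threshold) {
    // Smallest elements are at the front; erase everything <= threshold.
    v.erase(v.begin(), std::upper_bound(v.begin(), v.end(), threshold));
}
```

The insert is still O(n), but shifting a contiguous block is effectively a memmove, which for hundreds to low thousands of elements can beat node-based containers despite the worse asymptotic complexity.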

Upvotes: 0
