Jendas
Jendas

Reputation: 3579

Remove elements from vector on given indexes, order does not matter

What I have is a vector of elements, I do not care about the order of them. Than I have N indexes (each addressing unique position in the vector) of elements to be removed from the vector. I want the removal be as fast as possible.

Best I could come up with was to store indexes in set (order indexes):

std::set<unsigned int> idxs;
for (int i=0; i<N; ++i)
    idxs.insert(some_index);

And than iterating over the set in reversed order and replacing index to remove by last element of the vector.

std::set<unsigned int>::reverse_iterator rit;
for (rit = idxs.rbegin(); rit != idxs.rend(); ++rit) {
    vec[*rit].swap(vec[vec.size() - 1]);
    vec.resize(vec.size() - 1);
}

However I was thinking whether there is some more efficient way of doing this since the usage of set seems a bit overkill to me and I would love to avoid sorting at all.

EDIT1: Let us assume I use vector and sort it afterwards.

std::vector<unsigned int> idxs;
for (int i=0; i<N; ++i)
    idxs.push_back(some_index);
std::sort(idxs.begin(), idxs.end());

Can I push it any further?

EDIT2: I should have mentioned that the vector will have up to 10 elements. However the removal in my program occurs very often (hundreds of thousands times).

Upvotes: 5

Views: 1151

Answers (2)

JoeG
JoeG

Reputation: 13182

You can use swap/pop_back to remove an item at a given index, and keep track of which indices you've moved with a hash table. It's linear space & time in the number of removals.

std::vector<T> vec = ...;
std::vector<unsigned int> idxs;
std::unordered_map<unsigned int, unsigned int> map;

for(auto index : idxs) {
  unsigned int trueIndex = index;
  while (trueIndex >= vec.size()) {
    trueIndex = map[trueIndex];
  }

  // element at index 'vec.size()-1' is being moved to index 'index'   
  map[vec.size()-1] = index; 
  swap(vec[trueIndex], vec[vec.size()-1]);
  vec.pop_back();   
}

Upvotes: 0

peterchen
peterchen

Reputation: 41096

set is a good choice. I'd guess using another allocator (e.g. arena)would have the biggest impact. Why not use a set instead of a vector of elements to begin with?

I see the following relevant variations:

  • Instead of remove, create a new vector and copy preserved elements, then swap back.
    This keeps your indices stable (unlike removal, which would require sorting or updating the indices).

  • Instead of a vector of indices, use a vector of bools of the same length as your data. With the length of "maximum 10" given, a bit mask seems sufficient

So, roughly:

struct Index 
{
   DWORD removeMask = 0;  // or use bit vector for larger N
   void TagForRemove(int idx) { removeMask |= (1<<idx); }
   boll DoRemove(int idx) const { return (removeMask & (1<<idx)) != 0; }
}

// create new vector, or remove, as you like
void ApplyRemoveIndex(vector<T> & v, Index remove)
{
   vector<T> copy;
   copy.reserve(v.size());
   for (i=0..v.size())
     if (!remove.DoRemove(i))
       copy.push_back(v[i]);
   copy.swap(v);
}

Upvotes: 1

Related Questions