marcman
marcman

Reputation: 3383

Moving elements from one vector to another using erase-remove paradigm

This question clearly articulates how to move contents from one std::vector to another. In short, there is a std::move call required to physically move the memory, while there is also a std::erase call necessary to resize the original vector to account for the removed elements.

Is there a problem doing this using the erase-remove paradigm as one uses to delete from a vector while iterating over it (like here)?

For example, something like this:

// Vector with values [0, 1, ..., 19]
std::vector<int> myOldVec;
for (int i = 0; i < 20; i++) { myOldVec.push_back(i); }

// New vector to move into
std::vector<int> myNewVec;

// Move from myOldVec to myNewVec if value is less than 5
myOldVec.erase(
    std::remove_if(
        myOldVec.begin(),
        myOldVec.end(),
        [&](const int x)
        {
            if (x < 5)
            {
                myNewVec.push_back(x);
                return true;
            }
            return false;
        }
    ),
    myOldVec.end()
);

The intended output would be

myOldVec --> [5, 6, ..., 19]
myNewVec --> [0, 1, 2, 3, 4]

When I run this code it works in my tester. However, when dealing with objects instead of ints I worry that I'm not actually moving anything, but rather just referencing; for example, when doing the above with std::vector<MyObj> instead (with the appropriate lambda test).

Is this really performing a move, or am I right to be concerned that I'm just sharing a reference?

Upvotes: 4

Views: 7363

Answers (1)

Nir Friedman
Nir Friedman

Reputation: 17704

I think that generally the point of these algorithms, is that you are doing what you want to achieve by applying functions. Once the functions have side-effects, it seems like if anything the net result is maybe misleading, and it might just be better to do a for loop.

That said, remember that C++ is not Java. A vector<Foo> has to store Foo's, it just can't store references. However, there is still a problem with your whole idea.

myNewVec.push_back(x);

This line from your code will push a copy of x into your new vector. Because it's a copy, you don't have to worry about sharing references. Now, for integers, copies and moves are the same. But for complex objects (like, say, a vector), a move can be way, way faster than a copy. And the only vector is anyway getting rid of x, so we definitely want to move. So ideally we would change that line to:

myNewVec.push_back(std::move(x));

However, moving from an object clearly mutates it, and requires it not be const. The requirements for remove_if however require that the function object passed is a predicate. This, in turn means that:

The function object pred shall not apply any non-constant function through the dereferenced iterator.

http://en.cppreference.com/w/cpp/concept/Predicate

In other words, your function must accept the result of dereferencing the iterator, but is not supposed to mutate it. Because it is not supposed to mutate it, you can never move from the original object, but have to make a copy of it. Therefore, I don't think there is any conformant, efficient implementation of this idea for non-trivial types.

Here is a reasonable implementation:

template <class T, class F>
void transfer_if_not(std::vector<T>& old, std::vector<T>& new, F pred)
{
    auto part = std::partition(old.begin(), old.end(), pred);
    std::move(part, old.end(), std::back_inserter(new));
    old.erase(part);
}

This at least should not copy any elements. It will basically separate out the elements to stay and to leave in place in the original vector. Then efficiently move out the ones that are leaving. And then simply resize the array. As pointed out in the comments, this probably involves extra operations compared to what's optimal, but my sense is that the optimal version is not trivial to code up, and may involve tradeoffs (like more state), so it may not be a pure win.

Note that my algorithm accepts a vector specifically instead of iterators, because for other containers (say a linked list), this implementation is quite far from optimal.

Upvotes: 4

Related Questions