WilliamKF
WilliamKF

Reputation: 43089

How can I insert into a set and a deque

I have a C++ function that needs to insert a range of consecutive integers into a set and for each new element of the set at the end of a dequeue in the same order as the iteration. Below is a solution that is approximately O(log(n) * n) due to the repeated inserts each being O(log(n)). I would like to get an O(n) solution. I would like to use the set::insert() that takes a hint iteration position, but if I do that, I don't see how to determine in constant time whether the item was already in the set or not.

#include <deque>
#include <set>

void
insertUnique(const int beginOffset,
             const int endOffset,
             std::set<int> &sent,
             std::deque<int> &recent)
{
  for (int offset = beginOffset; offset < endOffset; ++offset) {
    const bool inserted = sent.insert(offset).second;

    if (inserted) {
      recent.push_back(offset);
    }
  }
}

Is there a way to refactor this to be O(n) and accomplish the same work while leaving the arguments to the function unchanged? Is there a way to insert with an iterator hint and also know whether or not the item was inserted?

Upvotes: 2

Views: 1063

Answers (1)

Richard Hodges
Richard Hodges

Reputation: 69854

if sent is merely being used to determine whether the integer has been queued, then I would suggest using a std::unordered_set since all insertions and searches have average constant time.

However, unless your set is going to become huge, it's unlikely to make much difference.

In fact, if the number of different integers being recorded is less than ~1000 then you might even get better real performance using a vector, particularly if you keep it sorted - since std::find() uses a binary search which is O(logN) time but without pointer de-references and with good memory locality.

EDIT:

Just for fun, I had a couple of attempts but the problem is that sent is a set<> which has no means of insertion quicker than O(logN).

This one copies the set to an unordered_set (average constant time operation) but the final insert is logN :-(

void
insertUnique_average_constant(const int beginOffset,
                              const int endOffset,
                              std::set<int> &sent,
                              std::deque<int> &recent)
{
    std::unordered_set<int> temp_sent(begin(sent), end(sent));

    for (int offset = beginOffset; offset < endOffset; ++offset) {
        const bool inserted = temp_sent.insert(offset).second;

        if (inserted) {
            recent.push_back(offset);
        }
    }

    sent.insert(begin(temp_sent), end(temp_sent));
}

This one may have some promise, if you can stomach it. It seeks to use set_difference (O2n) to reduce the size of the set of items that must be sent and then cached so in theory it increases in efficiency as the sent set expands.

// a minimal iterator that simply serves the next int in sequence
struct next_index_iterator
{
    using value_type = int;

    next_index_iterator(int value)
    : _value(value)
    {}

    bool operator==(const next_index_iterator& that) const {
        return this->_value == that._value;
    }

    next_index_iterator& operator++() {
        ++_value;
        return *this;
    }

    next_index_iterator operator++(int) {
        return next_index_iterator(_value + 1);
    }

    const int& operator*() const {
        return _value;
    }

private:
    int _value;
};

// necessary for set_difference to compile    
namespace std {
    template<>
    struct iterator_traits<next_index_iterator>
    {
        using value_type = next_index_iterator::value_type;
    };
}


void
insertUnique_sort_of_2N(const int beginOffset,
                              const int endOffset,
                              std::set<int> &sent,
                              std::deque<int> &recent)
{
    std::vector<int> to_send;
    to_send.reserve(endOffset - beginOffset); 

    // set_difference is O(2N)
    std::set_difference(std::begin(sent), std::end(sent),
                        next_index_iterator(beginOffset),
                        next_index_iterator(endOffset),
                        std::back_inserter(to_send));

    // at this point to_send contains only the offsets we have not sent
    for (const auto offset : to_send) {
        recent.push_back(offset);
    }

    // but we still have to insert these into `sent` at the end    
    sent.insert(std::begin(to_send), std::end(to_send));
}

Upvotes: 1

Related Questions