Mark
Mark

Reputation: 981

Unordered Iteration of a Container

My intent is to mask the implementation details of the container such that the client is not permitted to rely on the implicit insertion order. I'm trying to enforce this by somehow altering the order in which iteration occurs.

I have a container which I want to be randomly ordered when iterating. Here is some pseudo code.

namespace abc
{
template<class T>
class RandomList
{
  void insert(T t);
  T erase(T t);
  iterator begin();
  iterator end();
}
}

namespace test
{
  int main()
  {
    RandomList<int> list;
    list.insert(1);
    list.insert(2);
    list.insert(3);

    for (typename RandomList::iterator iter = list.begin();
         iter != list.end(); ++iter)
    {
       std::cout << iter << std::endl;
    }
  }    
}

Output:

3, 1, 2

My question is, what is the best way to implement RandomList. My naive thought is to just hold a member std::list and do rand() to determine whether insert does front inserts or back inserts.

Any other ideas?

I'm using mostly C++03, but I have access to boost.

Upvotes: 5

Views: 102

Answers (3)

mksteve
mksteve

Reputation: 13073

I would avoid random insertions, and randomize iteration. The idea that a large container is correct, will be difficult to see visually, and understanding insertion issues difficult

Upvotes: 0

sperber
sperber

Reputation: 661

I'm not sure if I understand all aspects of your problem. I think what qualifies as a good solution largely depends on the application and what the type T is going to be in practice. In any case, I suggest that you alter the interface of RandomList (see below).

First, your idea to use std::list internally and randomly insert at front or back seems to be a possible solution.

I do not think that it is a good idea to use std::vector if you aim for a broad range of applications. The reason is that std::vector implementations usually do a lot of extra stuff internally that may harm the performance/memory requirements of your container type. Usually std::vector uses a memory buffer that is larger than the actual size of the vector to avoid allocations on subsequent calls of insertion operations like push_back. Therefore, the performance of insertions may vary on subsequent calls (suddenly the vector buffer has to be increased again...), and your container may use much more memory than is really necessary. Again: this might not be a problem, it depends on your application and what T actually is. But as a user of your RandomList interface, I would be bothered to hear that it uses std::vector internally...

So if you are just asking for "another idea" and also want to allow for multiple insertions of the same value into the container - here is one: in essence, make RandomList into a std::map wrapper. The randomization should be implemented by properly employing the map key. To iterate the container you simply use the iterator of std::map. This actually gives you references to (key,value) pairs. You could use the std::map key to implement additional features that might be interesting. First of all, you can hide the key type - and therefore hide details of your implementation - by introducing a typedef RandomList::handle. I recommend that the method insert actually returns a RandomList::handle. In this way, you allow users to directly access specific container values by adding a method access to your interface. access takes RandomList::handle as argument and returns a reference to the corresponding map value. If you alter the interface in this way, it is consistent that your RandomList iterator (which is just the map iterator) refers to pairs of RandomList::handle and T. Whether this is useful or not largely depends on what T is going to be.

erase should take RandomList::handle as argument. Again: if multiple insertions are not welcome, the idea to base your implementation on std::map is problematic, or at least should be approached in a different manner.

This approach allows for a neat control of the "randomization implementation". For example, if you use std::list with random insertions at front or back, the implementation of the randomization is very closely tied to your internal storage/container implementation. An implementation on the basis of std::map could keep the details of the randomization more seperate from the rest of the implementation, as the randomization is fully controlled in terms of the map key selection. For example, if you use an int as key, you could hold a counter internally which is incremented on subsequent insertions. The current value of the counter is used as key for the next map insertion. As this would lead to a completely unbalanced tree structure, RandomList would behave like an ordinary std::list on iterations. I would suggest to choose new key values in such a manner that the tree is perfectly balanced. In this way, the direct access feature is implemented most efficiently (cause search operations are fast), and a straightforward iteration of the RandomList/std::map via begin()/end() should lead to a sufficiently "random" result.

Finally, regarding the interface: I suggest that you use perfect forwarding for the insert operation instead of a direct copy. In this way, you avoid unneccessary copy operations when inserting elements into std::map (same story for std::list of course), and you additionally allow for move operations. If you do not want to use perfect forwarding, at least change T to const T&, as another copy operation is required to insert the object into the std::list/std::map container.

Upvotes: 0

Chris Drew
Chris Drew

Reputation: 15334

I'm not sure I fully understand your use case but it is an interesting question nonetheless.

Tony D's suggestion to use a std::vector seemed like a good one. I've put the inserted value at the end and then swapped with a random element:

template<typename T>
class RandomList {
  std::vector<T> list;
  RandomIndex    randomIndex;
public:
  using iterator = typename std::vector<T>::const_iterator;
  iterator begin() { return list.begin(); }
  iterator end() { return list.end(); }

  void insert(const T& t) {
    list.push_back(t);

    auto i = randomIndex(list.size());

    using std::swap;
    swap(list[i], list.back());
  }
};

Where RandomIndex is a (non-templated) helper functor to get a random index:

class RandomIndex {
  std::mt19937 eng;
public:
  RandomIndex() : eng(std::random_device{}()) {}

  size_t operator()(size_t size) {
    auto dist = std::uniform_int_distribution<size_t>{0, size - 1};
    return dist(eng);    
  }
};

I'm not sure about the quality of the randomness but it should be enough to ensure a client can't make any assumptions about the order of the elements.

Upvotes: 1

Related Questions