Anne van Rossum

Reputation: 3149

How to circumvent iteration over an output iterator?

The algorithm I implemented below is Robert Floyd's well-known sampling algorithm, which selects M distinct random elements from an array of N elements. The algorithm returns a set of elements, but while it runs it has to search the partial result set to check whether a candidate element has already been added.

It is not possible to loop over the output iterator, because the documentation states that an output iterator is single-pass: each position should be dereferenced (written) only once, and what has been written cannot be read back.

template<typename Iter, typename RandomGenerator>
Iter random_element(Iter start, Iter end, RandomGenerator& g) {
    if (start == end) return start;
    std::uniform_int_distribution<> dis(0, std::distance(start, end) - 1);
    std::advance(start, dis(g));
    return start;
}

template<typename Iter>
Iter random_element(Iter start, Iter end) {
    static std::random_device rd;
    static std::mt19937 gen(rd());
    return random_element(start, end, gen);
}

//! @brief Algorithm of Robert Floyd.
template<typename InputIterator, typename OutputIterator>
OutputIterator random_n(InputIterator first, InputIterator last, OutputIterator result, size_t number) {
    // "misuse" the libstdc++ concept-check macros to enforce the documented requirements
    typedef typename std::iterator_traits<InputIterator>::value_type ValueType;
    __glibcxx_function_requires(_InputIteratorConcept<InputIterator>);
    __glibcxx_function_requires(_OutputIteratorConcept<OutputIterator, ValueType>);
    __glibcxx_requires_valid_range(first, last);

    if (first == last) return result;
    if (number == 0) return result;
    assert (number <= (last - first));

    // create container to store offsets from first, not the values themselves or the iterators
    std::vector<size_t> distance;
    const size_t total = std::distance(first, last);

    // j is the exclusive upper bound of the range sampled in each round;
    // in the case of number=1, j equals total, so the full array is searched.
    // Counting with an index avoids incrementing an iterator past last.
    for (size_t j = total - number + 1; j <= total; ++j) {
        InputIterator rand_index = random_element(first, first + j);
        size_t rand = std::distance(first, rand_index);
        if (std::find(distance.begin(), distance.end(), rand) != distance.end()) {
            // offset already chosen: take the last offset of the current range instead
            distance.push_back(j - 1);
        } else {
            distance.push_back(rand);
        }
    }
    // fill result container
    for (size_t i = 0; i < distance.size(); ++i) {
        *result = *(first+distance[i]);
        ++result;
    }
    return result;
}

The current solution creates a temporary vector that stores the distances with respect to the iterator first and finally fills the result array in one go, using these distances. It looks ugly to me though. Is there maybe some special iterator construct that is used to cope with the fact that you cannot loop multiple times over an output iterator?

Upvotes: 0

Views: 283

Answers (2)

Potatoswatter

Reputation: 137810

Your function can do whatever it wants. Then you need to specify to the user how the templated arguments are used.

The name OutputIterator is just an identifier; it doesn't introduce any restrictions or capabilities from the standard. It is a form of documentation, so if you make a second pass and use the iterator as an input, OutputIterator would be a misleading name.

According to the Standard, ForwardIterator requires the multi-pass guarantee: you can keep a previous value of the iterator, read the object it refers to multiple times and keep getting the same value, and the underlying sequence still exists. All of this seems necessary and sufficient for your purpose, so you might call the template parameter ForwardIterator. But it's still just a name. Until a more stringent system is implemented, C++ templates use duck typing.

The Standard suggests and names certain common interfaces, but anything goes.
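To make the multi-pass idea concrete, here is a minimal sketch (not the asker's code) of Floyd's sampling where the part of the output range written so far doubles as the "already chosen" set. The names random_n_forward and sample_distinct are hypothetical. The sketch assumes the input iterator is random access, the output is a mutable forward iterator into a range with room for `number` elements, and the input values are distinct, because it compares values rather than positions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <random>
#include <vector>

// Hypothetical name. Writes `number` distinct elements of [first, last) to result.
// Assumption: input values are distinct, since membership is tested by value.
template <typename RandomIt, typename ForwardIt, typename Rng>
ForwardIt random_n_forward(RandomIt first, RandomIt last, ForwardIt result,
                           std::size_t number, Rng& g) {
    using Diff = typename std::iterator_traits<RandomIt>::difference_type;
    const Diff n = std::distance(first, last);
    const Diff m = static_cast<Diff>(number);
    assert(m <= n);

    ForwardIt out = result;  // one past the last element written so far
    for (Diff j = n - m; j < n; ++j) {
        std::uniform_int_distribution<Diff> dis(0, j);  // inclusive bounds
        const Diff t = dis(g);
        // Multi-pass guarantee: re-read what was already written to [result, out).
        if (std::find(result, out, first[t]) != out) {
            *out = first[j];  // first[t] already chosen, so take first[j]
        } else {
            *out = first[t];
        }
        ++out;
    }
    return out;
}

// Hypothetical convenience wrapper: sample into a pre-sized vector.
template <typename T, typename Rng>
std::vector<T> sample_distinct(const std::vector<T>& in, std::size_t number, Rng& g) {
    std::vector<T> out(number);
    random_n_forward(in.begin(), in.end(), out.begin(), number, g);
    return out;
}
```

Called as sample_distinct(values, 4, gen) with a std::mt19937 gen, this needs no temporary vector of offsets. The trade-off is that something like std::back_inserter no longer works as the destination, because the destination range has to exist up front and be readable.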

Upvotes: 1

You can tighten the requirements of your algorithm and require a ForwardIterator to point to the output.
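One way to make that tighter requirement visible at compile time, sketched under the assumption that checking the iterator category tag is enough for your purposes: the trait is_forward_iterator and the function require_forward_output are hypothetical names, not standard library facilities.

```cpp
#include <cassert>
#include <iterator>
#include <type_traits>
#include <vector>

// Hypothetical trait: true when It's category tag is, or derives from,
// std::forward_iterator_tag (so bidirectional and random access also pass).
template <typename It>
struct is_forward_iterator
    : std::is_base_of<std::forward_iterator_tag,
                      typename std::iterator_traits<It>::iterator_category> {};

// Hypothetical helper: call it at the top of an algorithm to reject
// single-pass output iterators such as std::back_insert_iterator.
template <typename ForwardIt>
void require_forward_output(ForwardIt) {
    static_assert(is_forward_iterator<ForwardIt>::value,
                  "result must be at least a forward iterator");
}
```

Note that the category tag alone says nothing about writability: a const_iterator passes this check, and the assignment through it then fails to compile inside the algorithm instead.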

Upvotes: 3
