Ahmed A
Ahmed A

Reputation: 3652

Iterator returned by set_union()

I have the following C++ code using set_union() from algorithm stl:

 9     int first[] = {5, 10, 15, 20, 25};
10     int second[] = {50, 40, 30, 20, 10};
11     vector<int> v(10);
12     vector<int>::iterator it;
13
14     sort(first, first+5);
15     sort(second, second+5);
16
17     it = set_union(first, first + 5, second, second + 5, v.begin());
18
19     cout << int(it - v.begin()) << endl;

I read through the document of set_union from http://www.cplusplus.com/reference/algorithm/set_union/ . I have two questions:

Would really appreciate if someone can shed some light.

Thank you, Ahmed.

Upvotes: 1

Views: 928

Answers (2)

Praetorian
Praetorian

Reputation: 109109

The documentation for set_union states that the returned iterator points past the end of constructed range, in your case to one past the last element in v that was written to by set_union.

This is the reason it - v.begin() results in the length of the set union also. Note that you are able to simply subtract the two only because a vector<T>::iterator must satisfy the RandomAccessIterator concept. Ideally, you should use std::distance to figure out the interval between two iterators.

Your code snippet can be written more idiomatically as follows:

int first[] = {5, 10, 15, 20, 25};
int second[] = {50, 40, 30, 20, 10};

std::vector<int> v;
v.reserve(10);  // reserve instead of setting an initial size

sort(std::begin(first), std::end(first));       
sort(std::begin(second), std::begin(second));
// use std::begin/end instead of hard coding length

auto it = set_union(std::begin(first), std::end(first), 
                    std::begin(second), std::end(second), 
                    std::back_inserter(v));
// using back_inserter ensures the code works even if the vector is not 
// initially set to the right size

std::cout << std::distance(v.begin(), it) << std::endl;
std::cout << v.size() << std::endl;
// these lines will output the same result unlike your example

In response to your comment below

What is the use of creating a vector of size 10 or reserving size 10

In your original example, creating a vector having initial size of at least 8 is necessary to prevent undefined behavior because set_union is going to write 8 elements to the output range. The purpose of reserving 10 elements is an optimization to prevent possibility of multiple reallocations of the vector. This is typically not needed, or feasible since you won't know the size of the result in advance.

I tried with size 1, works fine

Size of 1 definitely does NOT work fine with your code, it is undefined behavior. set_union will write past the end of the vector. You get a seg fault with size 0 for the same reason. There's no point in speculating why the same thing doesn't happen in the first case, that's just the nature of undefined behavior.

Does set_union trim the size of the vector, from 10 to 8. Why or is that how set_union() works

You're only passing an iterator to set_union, it knows nothing about the underlying container. So there's no way it could possibly trim excess elements, or make room for more if needed. It simply keeps writing to the output iterator and increments the iterator after each write. This is why I suggested using back_inserter, that is an iterator adaptor that will call vector::push_back() whenever the iterator is written to. This guarantees that set_union will never write beyond the bounds of the vector.

Upvotes: 1

Kevin
Kevin

Reputation: 247

first: "it" is an iterator to the end of the constructed range (i.e. equivalent to v.end())

second: it - v.begin() equals 8 because vector iterators are usually just typedefed pointers and therefore it is just doing pointer arithmetic. In general, it is better to use the distance algorithm than relying on raw subtraction

cout << distance(v.begin(), it) << endl;

Upvotes: 0

Related Questions