zapper
zapper

Reputation: 73

how to store contents in C++ vector

so pretty much every problem on an Online judge requires the basic work of storing the a bunch of values and then processing them. I normally use std::vector for these, but i am concerned that using push_back() for storing each element is bad for performance.

Currently i do this :

vector<int> vec;
int x;
for(int i=0;i<10;i++)
{
 cin>>x;
 vec.push_back(x);
}

but then i though if this would be better

vector<int> vec;
int x;
vec.reserve(10);
for(int i=0;i<10;i++)
 cin>>vec[i];

Which is more suitable? , Is there any difference in performance? And let's say i have a good reason to avoid C style arrays and yeah, i know the initial size of vector beforehand .

Upvotes: 0

Views: 2320

Answers (5)

nh_
nh_

Reputation: 2241

If you want to store more complex objects than just plain ints the fastest solution should be to create the objects in place (inside the vector) using emplace_back(), if your compiler supports that. Reserving enough space beforehand will prevent multiple reallocations.

class Widget
{
public:
    Widget(const std::string& name, int x, int y);
};

std::vector<Widget> widgets;
widgets.reserve(nWidgets);
widgets.emplace_back("w1", 0, 0);
widgets.emplace_back("w2", 100, 20);
// ...

Upvotes: 0

Matteo Italia
Matteo Italia

Reputation: 126957

First of all, careful, in your second example there's a glaring mistake: reserve expands the capacity of the vector, i.e. the number of elements you can put in it without vector performing a reallocation, but does not affect its logical size. For this reason, if you do reserve you still have to do push_back, otherwise you are accessing logically-nonexistent elements in your loop. What you probably meant was resize, which expands both the logical size and the capacity to the requested size.

Now, coming to performance:

Plain push_back vs reserve + push_back: mostly the same (if copy is cheap)

Even without reserve, the vector will reach the needed capacity in amortized constant time (actually in O(log N) time, which is hidden anyway in the O(N) of the loop).

OK, if you already know for certain the size that your vector will take it will avoid reallocations, but don't jump through hoops to determine how much to reserve.

Exception: types with an expensive copy constructor (move constructor in C++11). If you are storing objects that are expensive to copy/move, you'll want to avoid reallocations, so reserve may help here (although usually you store such types by pointer, avoiding the problem).

resize + operator[]: faster for "simple" types

What I actually saw that boosts slightly the performance when dealing with simple types (typically PODs, or in general stuff with extremely simple constructor/assignment operator) is to do a resize beforehand, and then just do assignments through the [] operator.

This avoids the additional complexity of push_back, which has to check the capacity and increment the "logical size"; assignment through the subscript operator, OTOH, when optimized resolves to a handful of assembly instructions.

Of course you don't want to resize and assign when you have complex types, where the initial default construction and the assignment offset the light bookkeeping that push_back has to do.

Upvotes: 2

Jerry Coffin
Jerry Coffin

Reputation: 490728

I think I'd prefer option C:

const int num = 10;

std::vector<int> vec;
vec.reserve(num);

std::copy_n(std::istream_iterator<int>(std::cin), num, std::back_inserter(vec));

...but if you place a high value on the code being short, perhaps you'd prefer D:

std::vector<int> vec(num);

std::copy_n(std::istream_iterator<int>(std::cin), num, vec.begin());

Personally, I don't like that quite as well (it's a bit less idiomatic, IMO) but it'll work perfectly well, and it is clearly shorter.

Upvotes: 2

Marson Mao
Marson Mao

Reputation: 3035

vector automatically grows when it's capacity is not enough, so push_back is not a good idea if you knows the size prior.

Your second approach is better, but there is a problem: reserve and resize are not the same.

You have 2 ways to do it more efficiently:

1.

vector<int> vec;
vec.resize(10); // actually change the size, so your vec is now of size 10
for (int i = 0; i < 10; ++i) // pre-increment is better
    cin >> vec[i];

2.

vector<int> vec;
vec.reserve(10); // only enlarge the capacity, but size is still 0
int x;
for (int i = 0; i < 10; ++i) // pre-increment is better
{
    cin >> x;
    vec.push_back(x); // would not trigger any memory reallocation, performance is fine
}

To be more detailed:

capacity and size:

capacity indicates the memory allocated for vector, so when this number is exceeded, the whole vector is copied to a new memory space, and this is not efficient, so try to avoid this as possible.

size is the number of current elements, it is <= capacity, whenever you do push_back, this number is increased by 1.

reserve and resize:

reserve changes the capacity, you tell the vector to "reserve" how many spaces for potential use, so if you can know your size, then call this function to make sure you can avoid memory reallocate. This function only change the capacity, but no touching size.

resize immediately change the vector's size, it can also incur memory allocation. If your new size is larger than old size, the newly added element are default initialized.

Refer here for full information:

http://www.cplusplus.com/reference/vector/vector/?kw=vector

Upvotes: 0

Read documentation of std::vector. You'll notice that reserve is just for

Request a change in capacity

Requests that the vector capacity be at least enough to contain n elements.

If n is greater than the current vector capacity, the function causes the container to reallocate its storage increasing its capacity to n (or greater).

In all other cases, the function call does not cause a reallocation and the vector capacity is not affected.

This function has no effect on the vector size and cannot alter its elements.

So you can't store elements after a reserve, you need to resize first.

vec.resize(10);
for(int i=0;i<10;i++)
 cin>>vec[i];

BTW, when you want to push some elements and you have an idea of how much elements, you should better vec.reserve(10) before:

vec.reserve(10);
for(int i=0;i<10;i++) {
 cin>>x;
 vec.push_back(x);
}

About performance issues: in your particular case you might not care (since I/O is much slower than vector functions). In general, you should try to reserve before a lot of push_back, since a push_back would internally do some reserve equivalent (re-allocation of some unknown implementation specific amount) if space is not enough. Perhaps not using reserve might trigger reallocation periodically, and you might want to avoid that.

I bet that storing in vec[i] after a suitable resize is the fastest, since operator [] is documented as:

Returns a reference to the element at position n in the vector container.

A similar member function, vector::at, has the same behavior as this operator function, except that vector::at is bound-checked and signals if the requested position is out of range by throwing an out_of_range exception.

Portable programs should never call this function with an argument n that is out of range, since this causes undefined behavior.

BTW, most STL implementations (including libstdc++ from GCC...) are somehow free software, so you could study their source code to understand what exactly is done, and you could always benchmark.

Upvotes: 2

Related Questions