mathematician1975
mathematician1975

Reputation: 21351

Would I see a performance gain using std::map instead of vector<pair<string, string> >?

I currently have some code where I am using a vector of pair<string,string>. This is used to store some data from XML parsing and as such, the process is quite slow in places. In terms of trying to speed up the entire process I was wondering if there would be any performance advantage in switching from vector<pair<string,string> > to std::map<string,string> ? I could code it up and run a profiler, but I thought I would see if I could get an answer that suggests some obvious performance gain first. I am not required to do any sorting, I simply add items to the vector, then at a later stage iterate over the contents and do some processing - I have no need for sorting or anything of that nature. I am guessing that perhaps I would not get any performance gain, but I have never actually used a std::map before so I don't know without asking or coding it all up.

Upvotes: 4

Views: 3407

Answers (5)

BigBoss
BigBoss

Reputation: 6914

As C++ say std::vector sort items in a linear memory, so first it allocate a memory block with an initial capacity and then when you want to insert new item into vector it will check if it has more room or not and if not it will allocate a new buffer with more space, copy construct all items into new buffer and then delete source buffer and set it to new one.

When you just start inserting items into vector and you have lot of items you suffer from too many reallocation, copy construction and destructor call.

In order to solve this problem, if you now count of input items (not exact but its usual length) you can reserve some memory for the vector and avoid reallocation and every thing. if you have no idea about the size you can use a collection like std::list witch never reallocate its internal items.

Upvotes: 0

Galimov Albert
Galimov Albert

Reputation: 7357

If you are not modifying your vector<pair<string,string> > - just iterating it over and over - you will get perfomance degradation by using map. This is because typical map is organized with binary tree of objects, each of which can be allocated in different memory blocks (unless you write own allocator). Plus, each node of map manages pointers to neighbor objects, so it's time and memory overhead, too. But, search by key is O(log) operation. On other side, vector holds data in one block, so processor cache usually feels better with it. Searching in vector is actually O(N) operation which is not so good but acceptable. Search in sorted vector can be upgraded to O(log) using lower_bound etc functions.

It depends on operations you doing on this data. If you make many searches - probably its better to use hashing container like unordered_map since search by key in this containers is O(1) operation. For iterating, as mentioned, vector is faster.

Probably it is worth to replace string in your pair, but this highly depends on what you hold there and how access container.

Upvotes: 6

Andrew Durward
Andrew Durward

Reputation: 3861

If your usage pattern is such that you perform many insertions before performing any lookups, then you might benefit from implementing a "lazy" map where the elements are sorted on demand (i.e. when you acquire an iterator, perform a lookup, etc).

Upvotes: 1

Dietmar K&#252;hl
Dietmar K&#252;hl

Reputation: 153840

The answer depends on what you are doing with these data structures and what the size of them is. If you have thousands of elements in your std::vector<std::pair<std::stringm std::string> > and you keep searching for the first element over and over, using a std::map<std::string, std::string> may improve the performance (you might want to consider using std::unordered_map<std::string, std::string> for this use case, instead). If your vectors are relatively small and you don't trying to insert elements into the middle too often, using vectors may very well be faster. If you just iterate over the elements, vectors are a lot faster than maps: iterations isn't really one of their strength. Maps are good at looking things up, assuming the number of elements isn't really small because otherwise a linear search over a vector is still faster.

The best way to determine where the time is spent is to profile the code: it is often not entirely clear up front where the time is spent. Frequently, the suspected hot-spots are actually non-problematic and other areas show unexpected performance problems. For example, you might be passing your objects my value rather than by reference at some obscure place.

Upvotes: 5

user229044
user229044

Reputation: 239311

No. If (as you say) you are simply iterating over the collection, you will see a small (probably not measurable) performance decrease by using a std::map.

Maps are for accessing a value by its key. If you never do this, map is a bad choice for a container.

Upvotes: 11

Related Questions