Reputation: 21351
I currently have some code where I am using a vector
of pair<string,string>
. This is used to store some data from XML parsing and as such, the process is quite slow in places. In terms of trying to speed up the entire process I was wondering if there would be any performance advantage in switching from vector<pair<string,string> >
to std::map<string,string>
? I could code it up and run a profiler, but I thought I would see if I could get an answer that suggests some obvious performance gain first. I am not required to do any sorting, I simply add items to the vector, then at a later stage iterate over the contents and do some processing - I have no need for sorting or anything of that nature. I am guessing that perhaps I would not get any performance gain, but I have never actually used a std::map
before so I don't know without asking or coding it all up.
Upvotes: 4
Views: 3407
Reputation: 6914
As C++ say std::vector
sort items in a linear memory, so first it allocate a memory block with an initial capacity and then when you want to insert new item into vector it will check if it has more room or not and if not it will allocate a new buffer with more space, copy construct all items into new buffer and then delete source buffer and set it to new one.
When you just start inserting items into vector
and you have lot of items you suffer from too many reallocation, copy construction and destructor call.
In order to solve this problem, if you now count of input items (not exact but its usual length) you can reserve
some memory for the vector and avoid reallocation and every thing.
if you have no idea about the size you can use a collection like std::list
witch never reallocate its internal items.
Upvotes: 0
Reputation: 7357
If you are not modifying your vector<pair<string,string> >
- just iterating it over and over - you will get perfomance degradation by using map
. This is because typical map
is organized with binary tree of objects, each of which can be allocated in different memory blocks (unless you write own allocator). Plus, each node of map
manages pointers to neighbor objects, so it's time and memory overhead, too. But, search by key is O(log) operation. On other side, vector
holds data in one block, so processor cache usually feels better with it. Searching in vector is actually O(N) operation which is not so good but acceptable. Search in sorted vector can be upgraded to O(log) using lower_bound etc functions.
It depends on operations you doing on this data. If you make many searches - probably its better to use hashing container like unordered_map
since search by key in this containers is O(1) operation. For iterating, as mentioned, vector
is faster.
Probably it is worth to replace string
in your pair
, but this highly depends on what you hold there and how access container.
Upvotes: 6
Reputation: 3861
If your usage pattern is such that you perform many insertions before performing any lookups, then you might benefit from implementing a "lazy" map where the elements are sorted on demand (i.e. when you acquire an iterator, perform a lookup, etc).
Upvotes: 1
Reputation: 153840
The answer depends on what you are doing with these data structures and what the size of them is. If you have thousands of elements in your std::vector<std::pair<std::stringm std::string> >
and you keep searching for the first
element over and over, using a std::map<std::string, std::string>
may improve the performance (you might want to consider using std::unordered_map<std::string, std::string>
for this use case, instead). If your vectors are relatively small and you don't trying to insert elements into the middle too often, using vectors may very well be faster. If you just iterate over the elements, vectors are a lot faster than maps: iterations isn't really one of their strength. Maps are good at looking things up, assuming the number of elements isn't really small because otherwise a linear search over a vector is still faster.
The best way to determine where the time is spent is to profile the code: it is often not entirely clear up front where the time is spent. Frequently, the suspected hot-spots are actually non-problematic and other areas show unexpected performance problems. For example, you might be passing your objects my value rather than by reference at some obscure place.
Upvotes: 5
Reputation: 239311
No. If (as you say) you are simply iterating over the collection, you will see a small (probably not measurable) performance decrease by using a std::map
.
Maps are for accessing a value by its key. If you never do this, map is a bad choice for a container.
Upvotes: 11