Reputation: 58
I am making a small project program that involves inputting quotes that are later saved into a database (in this case a .txt file). There are also commands the user can input, such as list (which shows the quotes by author) and random (which displays a random quote).
Here's the structure if I were to use a map (with the author string as the key):
struct Information {
    string quoteContent;
    vector<string> tags;
};
and here's the structure if I were to use a vector instead:
struct Information {
    string author;
    string quoteContent;
    vector<string> tags;
};
note: The largest number of quotes I've had in the database is 200 (imported from a file).
I was just wondering which data structure would yield better performance. I'm still pretty new to this c++ thing, so any help would be appreciated!
Upvotes: 2
Views: 1508
Reputation: 43014
The golden rule is: "When in doubt, measure."
i.e. Write some tests, do some benchmarking.
Anyway, considering that you have circa 200 items, I don't think there should be an important difference between the two cases on modern PC hardware. Big-O notation matters when N is big (e.g. 10,000s, 100,000s, 1,000,000s, etc.).
vector tends to be simpler than map, and I'd use it as the default container of choice (unless your main goal is to access the items given the author's name as a key, in which case map seems more logically suited).
Another option might be to have a vector with items sorted by author's name, so you can use binary search (which is O(log N)) inside the vector.
Upvotes: 0
Reputation: 106236
For your data volumes it obviously doesn't matter from a performance perspective, but multimap will likely let you write shorter, more comprehensible and maintainable code. Regarding the general performance of vectors vs. maps (which is good to know about, but likely only becomes relevant with millions of data elements or low-latency requirements)...
vector doesn't do any automatic sorting for you, so you'd probably push_back quotes as you read them, then do one std::sort once the data's loaded, after which you can find elements very quickly by author with std::binary_search or std::lower_bound, or identify insertion positions for new quotes using e.g. std::lower_bound. But if you want to insert a new quote thereafter, you have to move the existing vector elements from that position onwards out of the way to make room - that's relatively slow. As you're just doing a few ad-hoc insertions based on user input, the time to do that with only a few hundred quotes in the vector will be totally insignificant.
For the purposes of learning programming though, it's good to understand that a multimap is arranged as a kind of branching binary tree, with pointers linking the data elements, which allows relatively quick insertion (and deletion). For some applications, following all those pointers around can be more expensive (i.e. slower) than vector's contiguous memory (which works better with CPU cache memory), but in your case the data elements are all strings and vectors of strings that will likely (unless the Short String Optimisation kicks in) require jumping all over memory anyway.
In general, if author is naturally a key for your data, just use a multimap... it'll do all your operations in reasonable time - maybe not the fastest, but never particularly slow, unlike vector for post-data-population mid-container insertions (/deletions).
Upvotes: 2
Reputation: 1334
Depends on the purpose of usage. Both data structures have their pros and cons.
Vectors:
Maps:
(use unordered_map for better performance than map.)
Choose a data structure based on what you want to achieve.
Upvotes: 0