Reputation: 2611
I am looking for the correct data structure to store the number of characters of a given type in an incoming stream, letter by letter. I know the size of the alphabet in advance (about 10), but the stream will be about a 1gb. Main criteria is rapid access. Could use a list with an appropriately chosen enum
to make things clearer, but is this the best way?
Upvotes: 0
Views: 108
Reputation: 30606
Given the performance requirements, consider a layout that is contiguous in memory; thus helping reduce the cache misses.
Something like;
const std::size_t SIZE = 10;
int count[SIZE] = {};
// or
std::vector<int> count(SIZE, 0);
If you need to store the count together with the character, then a "pair" may help;
struct Datum {
Datum() : c('\0'), count(0) {}
char c; // assuming the "alphabet" is in the char range
int count;
};
std::vector<Datum> count(SIZE);
Herb Sutter and Bjarne offer some material and empirical evidence as to why std::vector
should be favoured. As always, measurements should be made to verify the performance given your data structure, algorithm and associated data accesses etc.
Upvotes: 4
Reputation: 499
For storing you can try make alphabet encoding table and simple char array (char is enough to store 1 of 10 different characters). Like:
map<int, char> m;
m['A'] = 1;
m['B'] = 2;
...
char data[SIZE];
for(int i = 0; i < SIZE; i++){
int ch = read();
data[i] = m[ch];
}
Or compress 2 items into one char.
Upvotes: 0
Reputation: 41347
A simple array would work best:
int counters[SIZE_OF_ALPHABET];
Upvotes: 0