Reputation: 2243
Judging from the title, I kinda did my program in a fairly complicated way. BUT! I might as well ask anyway xD
This is a simple program I did in response to question 3-3 of Accelerated C++, which is an awesome book in my opinion.
I created a vector:
vector<string> countEm;
It holds all the valid input strings, so each element of the vector is one string.
Next, I created a function
// Lowercases every string in vec in place.
void toLowerWords( vector<string> &vec )
{
    for( size_t loop = 0; loop < vec.size(); loop++ )
        transform( vec[loop].begin(), vec[loop].end(),
                   vec[loop].begin(), ::tolower );
}
that converts every string to lowercase for easier counting. So far, so good.
I created a third and final function to actually count the words, and that's where I'm stuck.
int counter( vector<string> &vec )
{
    for( int loop = 0; loop < vec.size(); loop++ )
        for( int secLoop = 0; secLoop < vec[loop].size(); secLoop++ )
        {
            if( vec[loop][secLoop] == ' ' )
That just looks ridiculous. Indexing the vector like a two-dimensional array to walk through the characters of each string until I find a space. Ridiculous. I don't believe that this is an elegant or even viable solution. If it were viable, I would then backtrack from the space, copy all the characters I've found into a separate vector, and count those.
My question, then, is: how can I split a vector of strings into separate words so that I can actually count them? I thought about using strchr, but it didn't give me any epiphanies.
Solution via Neil:
stringstream ss( input );
while( ss >> buffer )
    countEm.push_back( buffer );
From that I could easily count the (recurring) words.
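For anyone wondering what counting the recurring words could look like, here is a minimal sketch built on the same stringstream loop. The helper name countWords and the choice of std::map are assumptions for illustration, not something taken from Neil's answer:

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper (not from the original post): tallies how often
// each whitespace-separated word occurs in input.
std::map<std::string, int> countWords(const std::string& input)
{
    std::map<std::string, int> counts;
    std::istringstream ss(input);
    std::string buffer;
    while (ss >> buffer)
        ++counts[buffer];   // operator[] creates a missing key with value 0
    return counts;
}
```

Calling countWords("the quick the lazy the") maps "the" to 3 and the other two words to 1 each. Run the lowercase step first if "The" and "the" should count as the same word.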
Then I did a solution via Wilhelm that I will post once I re-write it since I accidentally deleted that solution! Stupid of me, but I will post that once I have it written again ^^
I want to thank all of you for your input! The solutions have worked and I became a little better programmer. If I could vote up your stuff, then I would :P Once I can, I will! And thanks again!
Upvotes: 2
Views: 3418
Reputation: 15275
Since C++11 there is a dedicated and very powerful iterator for iterating over patterns (for example, words) in a string: std::sregex_token_iterator.
With that and the iterator function std::distance, we can count all words (or other patterns) in a string by calculating the distance between the iterator to the first match and the end-of-sequence iterator.
The resulting program is essentially a one-liner:
#include <iostream>
#include <string>
#include <algorithm>
#include <iterator>
#include <regex>
const std::regex re{R"(\w+)"};
const std::string test{"the quick brown fox jumps over the lazy dog"};
int main()
{
    std::cout << std::distance(std::sregex_token_iterator(test.begin(), test.end(), re), {});
}
With this method, we can of course also split the string and show the resulting words:
#include <iostream>
#include <string>
#include <algorithm>
#include <iterator>
#include <regex>
const std::regex re{R"(\w+)"};
const std::string test{"the quick brown fox jumps over the lazy dog"};
int main()
{
    std::copy(std::sregex_token_iterator(test.begin(), test.end(), re), {},
              std::ostream_iterator<std::string>(std::cout, "\n"));
}
By using std::vector's range constructor, we can also store the words in a std::vector:
#include <iostream>
#include <string>
#include <algorithm>
#include <iterator>
#include <regex>
#include <vector>
const std::regex re{R"(\w+)"};
const std::string test{"the quick brown fox jumps over the lazy dog"};
int main()
{
    std::vector<std::string> words(std::sregex_token_iterator(test.begin(), test.end(), re), {});
    std::cout << words.size();
}
As you can see, there are many possibilities.
If you have a stream, you can use std::istream_iterator for the same purpose.
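A minimal sketch of the stream variant, assuming words are separated by whitespace (which is how operator>> splits a stream into std::string values). The function names countWordsInStream and countWordsInText are made up for this example:

```cpp
#include <cstddef>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper: counts whitespace-separated words read from any input stream.
std::size_t countWordsInStream(std::istream& in)
{
    // A default-constructed istream_iterator acts as the end-of-stream marker.
    return static_cast<std::size_t>(std::distance(
        std::istream_iterator<std::string>(in),
        std::istream_iterator<std::string>()));
}

// Convenience overload for text that is already held in a string.
std::size_t countWordsInText(const std::string& text)
{
    std::istringstream in(text);
    return countWordsInStream(in);
}
```

Unlike the regex version, this treats punctuation as part of a word, so "dog." and "dog" would be different tokens. The stream is consumed by the count, so it cannot be read again afterwards without resetting it.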
Upvotes: 0
Reputation: 58725
You can use std::istringstream to extract the words one by one and count them, but that solution uses O(n) extra space. The following loop counts the words by scanning the string in place instead:
string text("So many words!");
size_t count = 0;
for( size_t pos(text.find_first_not_of(" \t\n"));
     pos != string::npos;
     pos = text.find_first_not_of(" \t\n", text.find_first_of(" \t\n", ++pos)) )
    ++count;
Perhaps not as short as Neil's solution, but it needs no extra space or allocation beyond what's already in use.
Upvotes: 2
Reputation: 7148
Use a tokenizer such as the one listed here in section 7.3 to split the strings in your vector into single words (or rewrite it so that it just returns the number of tokens) and loop over your vector to count the total number of tokens you encounter.
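Since the tokenizer referred to above isn't shown here, the following is only a rough stand-in for the "just return the number of tokens" variant, plus the loop over the vector. The names countTokens and countAllTokens and the default delimiter set are assumptions:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a tokenizer that only reports how many tokens it found.
// delims lists the characters that separate tokens.
std::size_t countTokens(const std::string& text, const std::string& delims = " \t\n")
{
    std::size_t count = 0;
    std::size_t pos = text.find_first_not_of(delims);   // start of first token
    while (pos != std::string::npos) {
        ++count;
        pos = text.find_first_of(delims, pos);          // end of current token
        pos = text.find_first_not_of(delims, pos);      // start of next token, or npos
    }
    return count;
}

// Sums the token counts of every string in the vector.
std::size_t countAllTokens(const std::vector<std::string>& lines)
{
    std::size_t total = 0;
    for (const std::string& line : lines)
        total += countTokens(line);
    return total;
}
```

Passing a different delimiter string, for example ",;", makes the same function work for comma- or semicolon-separated input.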
Upvotes: 1
Reputation:
If the words are always space separated, the easiest way to split them is to use a stringstream:
string words = .... // populate
istringstream is( words );
string word;
while( is >> word ) {
    cout << "word is " << word << endl;
}
You'd want to write a function to do this, of course, and apply it to your strings. Or it may be better not to store the strings at all, but to split the input into words as you read it.
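One possible shape for such a function, as a sketch only. The names appendWords and splitAll are invented here, and it assumes whitespace-separated words as stated above:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: appends each whitespace-separated word of line to out.
void appendWords(const std::string& line, std::vector<std::string>& out)
{
    std::istringstream is(line);
    std::string word;
    while (is >> word)
        out.push_back(word);
}

// Applies appendWords to every stored string and returns all words in order.
std::vector<std::string> splitAll(const std::vector<std::string>& lines)
{
    std::vector<std::string> words;
    for (const std::string& line : lines)
        appendWords(line, words);
    return words;
}
```

To split on initial input instead, the same while( is >> word ) loop can read straight from cin, which skips the intermediate vector of unsplit strings entirely.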
Upvotes: 2