Reputation: 589
I'm wondering whether there is a way to match a word only when it stands on its own, not when it appears inside another word. As you can see below, the output states that the word 'day' is found twice, but that's only because 'day' is also part of 'today'. I would like the search to count only the standalone word 'day' and ignore the occurrence inside 'today'.
Is this possible?
Note: the assignment asks us to use string manipulators.
// search for a particular word - member function
std::cout << "Please indicate a word which you would like to be found in the paragraph you entered: ";
getline(std::cin, searchWord);
// pos is the index in the string at which the next search starts; it advances until the end of the string.
size_t pos = 0;
int wordCount = 0;
// std::string::npos means "not found" (it is the largest size_t value, i.e. size_t(-1)).
while ((pos = userParagraph.find(searchWord, pos)) != std::string::npos) {
    ++pos;
    ++wordCount;
}
if (wordCount == 0) {
    std::cout << "The word you entered, '" << searchWord << "', was not found." << std::endl << std::endl;
}
else {
    std::cout << searchWord << " was found " << wordCount << " times." << std::endl << std::endl;
}
Upvotes: 1
Views: 279
Reputation: 60228
If you find a match, you can check whether the adjacent characters are letters using std::isalpha, and only count the match if they are not letters.
while ((pos = userParagraph.find(searchWord, pos)) != std::string::npos) {
    // count the match only if no letter touches it on either side
    if ((pos == 0 || !std::isalpha(userParagraph[pos - 1]))
        && (pos + searchWord.size() == userParagraph.size()
            || !std::isalpha(userParagraph[pos + searchWord.size()])))
        ++wordCount;
    ++pos;
}
Now the word won't be counted if it's part of another word.
Note that the extra checks (pos == 0 and the comparison against userParagraph.size()) are needed so that you never index outside the string.
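For reference, here is the same idea wrapped in a standalone function, as a minimal sketch. The name countWholeWord is made up for this example and is not part of the original code. It also casts to unsigned char before calling std::isalpha, because passing a negative char value to std::isalpha is undefined behavior.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not from the original post): counts occurrences of
// `word` in `text` that are not directly preceded or followed by a letter.
std::size_t countWholeWord(const std::string& text, const std::string& word) {
    if (word.empty()) return 0;  // an empty search word is treated as "no match"

    // std::isalpha has undefined behavior for negative char values,
    // so convert to unsigned char first.
    auto isLetter = [](char c) {
        return std::isalpha(static_cast<unsigned char>(c)) != 0;
    };

    std::size_t count = 0;
    std::size_t pos = 0;
    while ((pos = text.find(word, pos)) != std::string::npos) {
        const std::size_t end = pos + word.size();
        // The pos == 0 and end == text.size() checks keep the indexing in range.
        const bool startsWord = (pos == 0) || !isLetter(text[pos - 1]);
        const bool endsWord = (end == text.size()) || !isLetter(text[end]);
        if (startsWord && endsWord) ++count;
        ++pos;  // resume the search one character later
    }
    return count;
}
```

With this sketch, countWholeWord("today is a good day", "day") returns 1, since the 'day' inside 'today' is preceded by a letter.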
Upvotes: 1
Reputation: 73376
Yes, this is possible, but it requires you to decide what counts as a word boundary. For example, is '-' a word boundary like a space, or would you treat it as a letter?
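You may, for example, filter out matches that are not whole words by checking that the found string starts at the beginning of the paragraph or right after a non-letter, and ends at the end of the paragraph or right before a non-letter.
It looks like this: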
while ((pos = userParagraph.find(searchWord, pos)) != std::string::npos) {
    bool wstart = pos == 0 || !isalpha(userParagraph[pos - 1]);
    bool wend = pos + searchWord.size() == userParagraph.size()
             || !isalpha(userParagraph[pos + searchWord.size()]);
    if (wstart && wend)
        ++wordCount;
    ++pos;
}
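To make the word-boundary decision from the start of this answer explicit, one option is to pass in the extra characters that should count as part of a word. This is only a sketch: isWordChar, countWord and the extraWordChars parameter are names invented for this example.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: true if `c` should be treated as part of a word.
// `extraWordChars` lists non-letters that should NOT act as boundaries, e.g. "-".
bool isWordChar(char c, const std::string& extraWordChars) {
    return std::isalpha(static_cast<unsigned char>(c)) != 0
        || extraWordChars.find(c) != std::string::npos;
}

// Counts matches of `word` that have no word character on either side.
std::size_t countWord(const std::string& text, const std::string& word,
                      const std::string& extraWordChars = "") {
    if (word.empty()) return 0;
    std::size_t count = 0;
    for (std::size_t pos = text.find(word); pos != std::string::npos;
         pos = text.find(word, pos + 1)) {
        const std::size_t end = pos + word.size();
        const bool wstart = pos == 0 || !isWordChar(text[pos - 1], extraWordChars);
        const bool wend = end == text.size() || !isWordChar(text[end], extraWordChars);
        if (wstart && wend) ++count;
    }
    return count;
}
```

For the text "a day-long trip on a fine day", countWord(text, "day") returns 2 because '-' acts as a boundary, while countWord(text, "day", "-") returns 1 because 'day-long' is then treated as a single word.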
Caution: this approach works with single-byte character encodings only. With UTF-8, it would fail for languages that use letters outside the ASCII alphabet: the bytes of accented letters such as é, ñ or ä would be misinterpreted as valid word separators.
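If the input is known to be UTF-8, a rough workaround is to treat every non-ASCII byte as part of a word, since all bytes of a multi-byte UTF-8 character have the high bit set. This is a heuristic sketch, not real Unicode handling: it also treats non-ASCII punctuation as word characters. The names isWordByte and countWordUtf8 are made up for this example.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Heuristic (an assumption, not a full Unicode solution): any byte >= 0x80
// belongs to a multi-byte UTF-8 character, so treat it as part of a word.
bool isWordByte(char c) {
    const unsigned char uc = static_cast<unsigned char>(c);
    return uc >= 0x80 || std::isalpha(uc) != 0;
}

// Counts matches of `word` that have no word byte on either side.
std::size_t countWordUtf8(const std::string& text, const std::string& word) {
    if (word.empty()) return 0;
    std::size_t count = 0;
    std::size_t pos = 0;
    while ((pos = text.find(word, pos)) != std::string::npos) {
        const std::size_t end = pos + word.size();
        const bool wstart = pos == 0 || !isWordByte(text[pos - 1]);
        const bool wend = end == text.size() || !isWordByte(text[end]);
        if (wstart && wend) ++count;
        ++pos;
    }
    return count;
}
```

For example, with the text "caf\xC3\xA9 or caf" (where \xC3\xA9 is the UTF-8 encoding of é), searching for "caf" counts only the second occurrence, because the first one is followed by a non-ASCII byte.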
Upvotes: 2