Reputation: 382
I am reading a file word by word. I want to preprocess each word by converting all characters to lowercase and removing every non-alphabetic character except the hyphen (-) and the apostrophe ('), and then display the result. I have already completed the lowercase conversion,
but I have no idea how to remove the non-alphabetic characters while keeping hyphens and apostrophes. Can someone help me figure this out?
void WordStats::ReadTxtFile()
{
    std::ifstream ifile(Filename);
    if(!ifile)
    {
        std::cerr << "Error Opening file " << Filename << std::endl;
        exit(1);
    }
    for (std::string word; ifile >> word; )
    {
        transform (word.begin(), word.end(), word.begin(), ::tolower);
        WordMap & Words = (Dictionary.count(word) ? KnownWords : UnknownWords);
        Words[word].push_back(ifile.tellg());
    }
    std::cout << KnownWords.size() << " known words read." << std::endl;
    std::cout << UnknownWords.size() << " unknown words read." << std::endl;
}
Upvotes: 0
Views: 189
Reputation: 310
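You could use std::remove_if together with erase (the erase-remove idiom). Apply it after the lowercase conversion, since the check below only keeps 'a' to 'z':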
word.erase(std::remove_if(word.begin(), word.end(), [](char c)
{
    return (c < 'a' || c > 'z') && c != '\'' && c != '-';
}), word.end());
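For completeness, here is a minimal sketch that combines the lowercase step from the question with the erase-remove call above. The helper name normalize_word is hypothetical and not part of the original WordStats class. The cast to unsigned char is my addition, because passing a negative char value to std::tolower is undefined behavior.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper, not part of the original WordStats class.
// Returns the word lowercased, keeping only 'a'..'z', '-' and '\''.
std::string normalize_word(std::string word)
{
    // Lowercase first. The lambda takes unsigned char because calling
    // std::tolower with a negative char value is undefined behavior.
    std::transform(word.begin(), word.end(), word.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    // Erase-remove idiom: remove_if shifts the kept characters to the
    // front and returns the new logical end; erase trims the leftover tail.
    word.erase(std::remove_if(word.begin(), word.end(),
                              [](char c)
                              {
                                  return (c < 'a' || c > 'z') && c != '\'' && c != '-';
                              }),
               word.end());
    return word;
}
```

Inside the reading loop you would call word = normalize_word(word); before the dictionary lookup. Note that a token made up entirely of digits or punctuation becomes an empty string, so you may want to check word.empty() and skip it before inserting into the map.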
Upvotes: 1