Hofbr

Reputation: 1010

How to read words from a stream, ignoring numbers and treating all symbols as delimiters

I have a binary tree that takes in a stream of words and adds each one to a new node of the tree, skipping duplicates.

I tried implementing an input method like so:

void WordTree::input(std::istream& in) {
   std::string str;
   
   while (in >> str) {
      // convert to lowercase
      std::transform(str.begin(), str.end(), str.begin(), ::tolower);
      str.erase(std::remove_if(str.begin(), str.end(), ::isdigit), str.end());
      str.erase(std::remove_if(str.begin(), str.end(), ::ispunct), str.end());
      // add to tree
      if (str != " ")
         add(str);
   }
}

But for a test input like std::string str_input = " 100 2/3 alakazam qwerty up-time level up up up up bastion how are you 23 ALAKAZAM";

I'm getting the results as:

 occurrences: 3
alakazam occurrences: 2
are occurrences: 1
bastion occurrences: 1
how occurrences: 1
level occurrences: 1
qwerty occurrences: 1
up occurrences: 4
uptime occurrences: 1
you occurrences: 1

Unique word count: 10

It appears to be counting spaces as words, even though I have a conditional to ignore them. I also need an input like "up-time" to be split into two words, so that "up" and "time" are each their own word.

I tried implementing the method character by character for more control, but so far it just mushes the whole string together.
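A possible explanation, offered as a sketch rather than a confirmed diagnosis: `in >> str` never yields a token containing whitespace, so `str` can never equal `" "`. Tokens such as `100`, `2/3`, and `23` become empty strings once their digits and punctuation are erased, which would account for the blank entry with three occurrences. One way around both problems is to replace every non-letter with a space and then split the token again, so `up-time` yields `up` and `time`. The helper name `extractWords` is hypothetical, and it returns a vector instead of calling `add` because `WordTree` is not shown here.

```cpp
#include <cctype>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of the original WordTree class):
// returns lowercase alphabetic words read from `in`, treating every
// non-letter character as a delimiter.
std::vector<std::string> extractWords(std::istream& in) {
    std::vector<std::string> words;
    std::string token;

    while (in >> token) {
        // Lowercase letters; turn digits, punctuation, etc. into spaces.
        for (char& ch : token) {
            // Cast to unsigned char: passing a negative char to
            // std::isalpha / std::tolower is undefined behavior.
            unsigned char uc = static_cast<unsigned char>(ch);
            ch = std::isalpha(uc) ? static_cast<char>(std::tolower(uc)) : ' ';
        }

        // Re-split the cleaned token; all-blank tokens produce nothing.
        std::istringstream pieces(token);
        std::string word;
        while (pieces >> word) {
            words.push_back(word);
        }
    }
    return words;
}
```

Inside `WordTree::input`, the `words.push_back(word)` call would presumably become `add(word)`.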

void WordTree::input(std::istream& in) {

   char c;
   std::string str;

   
   while (in >> c) {
      if (std::isalpha(c)) {
         str = str + c;
      }
      else if (str != "") {
         std::transform(str.begin(), str.end(), str.begin(), ::tolower);
         add(str);
         str = "";
      }
   }
}

The output from my display method (which just traverses the tree and returns the word in each node) looks like:

alakazamqwertyup occurrences: 1
timelevelupupupupbastionhowareyou occurrences: 1

Upvotes: 2

Views: 36
