ThePancakerizer

Reputation: 1

Counting how many times certain words show up in a text file in C++

I am trying to write a program that works with two different text files. One of them contains the actual text I want to analyze and the other contains a list of words. The program is supposed to check whether each word in the text appears in the list and count the matches. Here is the (non-working) code I have so far:

#include <cstdlib>
#include <iostream>
#include <string>
#include <fstream>

using namespace std;

int main () {

    string word1;
    string word2;
    int listHits = 0;

    ifstream data1 ("text.txt");
    if ( ! data1 ) {
        cout << "could not open file: " << "text.txt" << endl;
        exit ( EXIT_FAILURE );
    }

    ifstream data2 ("list.txt");
    if ( ! data2 ) {
        cout << "could not open file: " << "list.txt" << endl;
        exit ( EXIT_FAILURE );
    }

    while ( data1 >> word1 ) {
        while ( data2 >> word2 ) {
            if ( word1 == word2 ) {
                listHits++;
            }
        }
    }

    cout << "Your text had " << listHits << " words from the list " << endl;

    system("pause");

    return 0;
}

If text.txt contains

Here is a text. It will be loaded into a program.

and list.txt contains

will a

the expected outcome is 3. However, no matter what is in the text files the program always gives me the answer 0. I have checked that the program actually manages to read the files by having it count the times it does the loops, and it works.

Thanks in advance

Upvotes: 0

Views: 1387

Answers (2)

Sergey Kalinichenko

Reputation: 726799

Your program goes through the "list of target words" file (i.e. data2) only once. File streams are "one way": once you exhaust one, you need to rewind it, or it stays at the end. The inner loop

while ( data2 >> word2 )
    ...

is going to execute only the first time through, i.e. for the first word of data1. For the second and all subsequent words, data2 will already be at the end of the file, so the code will not even enter the loop.

You should read your target words into memory, and use that list in the inner loop. Better yet, put the words in a set<string>, and use that set to do your counting.
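A minimal sketch of the set-based approach, assuming both inputs are plain whitespace-separated words. The helper name countListHits is made up for illustration, and it takes std::istream references so that it works with an ifstream as well as a string stream:

```cpp
#include <istream>
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper (not from the question): loads every word of `list`
// into a set, then counts how many words of `text` are found in that set.
int countListHits(std::istream& text, std::istream& list) {
    std::set<std::string> targets;
    std::string word;

    // The list stream is consumed exactly once, so no rewinding is needed.
    while (list >> word) {
        targets.insert(word);
    }

    int hits = 0;
    while (text >> word) {
        if (targets.count(word) > 0) {
            ++hits;
        }
    }
    return hits;
}
```

With the files from the question this would be called as countListHits(data1, data2). Note that the comparison is exact, so a token such as "text." (with the trailing period) does not match a list entry "text".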

Upvotes: 1

Alon

Reputation: 1804

It seems to me that you're only ever comparing the first word of the first file to the entire second file. You do:

    while ( data1 >> word1 ) {
        while ( data2 >> word2 ) { // <---- after this ends the first time, it will never enter again
            if ( word1 == word2 ) {
                listHits++;
            }
        }
    }

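You need to "reset" data2 after the inner loop finishes so that it starts reading again from the beginning of the file. Because the last failed read leaves the stream in a failed state, clear the error flags before seeking:

    while ( data1 >> word1 ) {
        while ( data2 >> word2 ) {
            if ( word1 == word2 ) {
                listHits++;
            }
        }
        data2.clear();              // reset eofbit/failbit set by the failed read
        data2.seekg (0, data2.beg);
    }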
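To check the rewind in isolation, here is a small sketch. The helper name countWordsAndRewind is invented for this example. It reads a stream to the end, then clears the error flags and seeks back to the start. The clear() call matters: the extraction that ends the loop sets failbit, and seekg does nothing while the stream is in a failed state.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: counts the words remaining in `in`, then rewinds the
// stream so that the next caller can read it from the beginning again.
int countWordsAndRewind(std::istream& in) {
    std::string word;
    int count = 0;
    while (in >> word) {
        ++count;
    }
    in.clear();                  // drop eofbit/failbit left by the failed read
    in.seekg(0, std::ios::beg);  // move the read position back to the start
    return count;
}
```

Calling it twice on the same stream should give the same count both times. Keep in mind that rereading the list file for every word of the text repeats a lot of I/O, so this is mainly worth it when the list cannot be held in memory.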

Upvotes: 1
