Setsuna

Reputation: 43

C programming: count the number of characters, words and lines in another text file

I've just started learning C and, as the title says, I have to write a program that reads a text file and counts the number of "characters", "words" and "sentences" until EOF is reached. My current problem is that I'm not able to produce the right output.

For example a text file containing the following contents...

the world 
is a great place.
lovely
and wonderful

should produce 39 characters, 9 words and 4 sentences, but somehow I get 50 (characters), 1 (words) and 1 (sentences).

This is my code:

#include <stdio.h>

int main()
{
int x;
char pos;
unsigned int long charcount, wordcount, linecount;

charcount = 0;
wordcount = 0;
linecount = 0;

while(pos=getc(stdin) != EOF)
{
    if (pos != '\n' && pos != ' ')
    {
    charcount+=1;
    }

    if (pos == ' ' || pos == '\n')
    {
    wordcount +=1;  
    }

    if (pos == '\n')
    {
    linecount +=1;
    }

}

    if (charcount>0)
    {
    wordcount+=1;
    linecount+=1;
    }

printf( "%lu %lu %lu\n", charcount, wordcount, linecount );
return 0;
}

Thanks for any help or suggestions.

Upvotes: 0

Views: 2685

Answers (1)

chux

Reputation: 153458

Because != has higher precedence than =, the two lines below mean the same thing.

// Not what OP needs
pos=getc(stdin) != EOF
pos=(getc(stdin) != EOF)

Instead, add parentheses so that the assignment happens before the comparison:

while((pos=getc(stdin)) != EOF) 
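
To make the difference concrete, here is a minimal sketch of what ends up in pos in each form. The helper names stored_without_parens and stored_with_parens are made up for illustration and are not part of the original code. It also suggests why the question's program printed 50 1 1: without the parentheses pos is 1 for every character read, which is never ' ' or '\n', so all 50 bytes were counted as characters and the word and line counters were bumped only once, after the loop.

```c
#include <stdio.h>

/* Hypothetical helper: the value stored by the unparenthesized form. */
int stored_without_parens(FILE *in) {
    int pos;
    pos = getc(in) != EOF;   /* parsed as pos = (getc(in) != EOF) */
    return pos;              /* 1 if a character was read, 0 at end of file */
}

/* Hypothetical helper: the value stored by the parenthesized form. */
int stored_with_parens(FILE *in) {
    int pos;
    if ((pos = getc(in)) != EOF) {
        return pos;          /* the character that was actually read */
    }
    return EOF;
}
```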

Declare pos as int, not char, to hold the value returned by getc(): it is either a value in the unsigned char range or EOF. That is typically 257 different values, too many for a char.

int main() {
  unsigned long character_count = 0;
  unsigned long word_count = 0;
  unsigned long line_count = 0;
  unsigned long letter_count = 0;
  int pos;

  while((pos = getc(stdin)) != EOF) {
    ...
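
As a rough illustration of why the int matters, the sketch below counts every byte of a stream. The helper name count_bytes is an assumption made for this example. With a char in place of the int, the result is implementation-defined: where char is signed, a byte such as 0xFF commonly converts to -1 and compares equal to EOF, ending the loop early; where char is unsigned, the comparison with EOF can never succeed and the loop does not end.

```c
#include <stdio.h>

/* Hypothetical helper: counts every byte in `in`.
 * The result of getc() is kept in an int so that EOF
 * stays distinct from all 256 possible byte values. */
unsigned long count_bytes(FILE *in) {
    unsigned long n = 0;
    int ch;

    while ((ch = getc(in)) != EOF) {
        n++;
    }
    return n;
}
```

With the int in place, a file holding the bytes 'a', 0xFF, 'b' is counted as 3 bytes rather than stopping at the 0xFF.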

You may want to review your word count strategy too. @Tony Tannous


For me, I would count a "word" each time a letter occurs that does not follow another letter, that is, at the start of the input or right after a non-letter. This avoids the problem @Tony Tannous raised, and other issues. Likewise, I would count a line at the very first character and at any character that follows a '\n', which avoids any post-loop calculation. This handles the issue commented on by Weather Vane.

It also appears that 39 is a letter count and not a character count (@BLUEPIXY).
Suggest using the <ctype.h> functions to test for letter-ness (isalpha()).

int previous = '\n';
while((pos = getc(stdin)) != EOF) {
  character_count++;
  if (isalpha(pos)) {
    letter_count++;
    if (!isalpha(previous)) word_count++;
  }
  if (previous == '\n') line_count++;
  previous = pos;
}

printf("characters %lu\n", character_count);
printf("letters %lu\n", letter_count);
printf("words %lu\n", word_count);
printf("lines %lu\n", line_count);
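
Putting the pieces together, one possible way to package the loop is sketched below. The struct counts type and the count_stream function are names invented for this example, not something from the question. The logic is the same previous-character approach as the loop in this answer.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical container for the four totals. */
struct counts {
    unsigned long characters;
    unsigned long letters;
    unsigned long words;
    unsigned long lines;
};

/* Hypothetical helper: reads `in` until EOF and returns the totals. */
struct counts count_stream(FILE *in) {
    struct counts c = {0, 0, 0, 0};
    int previous = '\n';  /* act as if a newline preceded the first character */
    int pos;              /* int, so EOF is distinct from every byte value */

    while ((pos = getc(in)) != EOF) {
        c.characters++;
        if (isalpha(pos)) {
            c.letters++;
            if (!isalpha(previous)) c.words++;  /* letter after a non-letter starts a word */
        }
        if (previous == '\n') c.lines++;        /* first character of a line */
        previous = pos;
    }
    return c;
}
```

Calling count_stream(stdin) from main and printing the four fields with %lu gives the same report as the printf calls in this answer. For the sample text in the question, assuming a space after "world" and a final newline, that is 50 characters, 39 letters, 9 words and 4 lines.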

Upvotes: 2
