Reputation: 23
I needed to make a program that counts the words, sentences, and letters in text entered by the user. The program works perfectly until the input I give spans multiple lines. If the input is longer than the text that can fit on one line of the terminal window, the program starts to ignore all full stops, question marks, and exclamation marks. I don't know why, and I'd like some help. This doesn't happen if the text fits on one line of the terminal window. I also printed every character as the program read it, but none of those punctuation characters were printed either. For clarification, the number of sentences is just the number of full stops, question marks, and exclamation marks, and the number of words is just the number of spaces in the text plus 1. Here is my code:
#include <stdio.h>
#include <ctype.h> //for the isalpha() function
#include <cs50.h> //for the get_string() function

int main(void)
{
    int sentences = 0, letters = 0;
    int words = 1;
    char character;
    string text = get_string("Enter Text: \n");
    char x = 0;
    while (text[x] != '\0')
    {
        character = text[x];
        switch (character)
        {
            case ' ':
                words++;
                break;
            case '.':
                sentences++;
                break;
            case '?':
                sentences++;
                break;
            case '!':
                sentences++;
                break;
            default:
                if (isalpha(character))
                {
                    letters++;
                }
        }
        x++;
    }
    printf("\n");
    printf("WORDS: %d, LETTERS: %d, SENTENCES: %d\n", words, letters, sentences);
}
I'm fairly new to C, but I have around a year of experience in Python. Thank you for your time.
Upvotes: 0
Views: 936
Reputation: 123598
I’m going to make a few suggestions.
First, don’t use get_string [1] (or scanf, or fgets). For a filter program like this, you don’t actually need to store the input in order to process it; use getchar (or fgetc) to read one character at a time and loop based on that:
int c; // getchar returns int, not char
...
puts( "Enter Text:" );
while ( ( c = getchar() ) != EOF )
{
    // test c instead of text[x]
}
This approach will handle input of any length (such as when you redirect a file as your input), and it avoids the potential overflow issue Weather Vane identified in the comments. The downside is that you’ll have to manually signal EOF from the console for interactive input (typically Ctrl-Z followed by Enter on Windows, or Ctrl-D on Linux and macOS).
You can collapse some of the tests in your switch, such as
case '.' :    // Each of these cases "falls through"
case '!' :    // to the following case.
case '?' :
    words++;      // the end of a sentence is also the end of a word
    sentences++;
    break;
You’ll want to add cases to handle newlines and tabs:
case ' ' :
case '\n' :
case '\t' :
    words++;
    break;
except you don’t want to bump the words counter for repeating whitespace characters, or if the previous non-whitespace character was a punctuation character. So you’ll want an extra variable to track the class of the previously-read character:
enum {NONE, TEXT, PUNCT, WHITE} class = NONE;
...
while ( ( c = getchar() ) != EOF )
{
    switch( c )
    {
        case ' ' :
        case '\n' :
        case '\t' :
            if ( class == TEXT )
                words++;
            class = WHITE;
            break;
        case '.' :
        case '!' :
        case '?' :
            if ( class == TEXT )    // Don’t bump the word counter
                words++;            // if the previous character
                                    // was whitespace or .!?
            if ( class != PUNCT )   // Don’t bump the sentence counter
                sentences++;        // for repeating punctuation
            class = PUNCT;
            break;
        ...
    }
}
There will still be weird corner cases where this won’t give a completely accurate count, but it should be good enough for most input.
You should be able to figure out the rest from there.
[1] The CS50 tools such as get_string are pretty slick, but they grossly misrepresent how C actually does things. The string typedef is especially egregious because what it aliases is not a string but a pointer (char *). Just be aware these tools will not be available outside the CS50 curriculum, so don’t become too reliant on them.
Upvotes: 1