Reputation: 3
So I'm supposed to count how many words there are in a txt file with multiple lines. A word is defined as a continuous sequence of letters (a through z, and A through Z) and apostrophes, separated by any character outside these ranges.
I've got what I think looks right, but the wordcount keeps on coming out wrong. Does anyone see anything weird about my code?
Please ignore the linecount and charcount, as they are working properly. I tried counting the spaces between the words, with 32 being the ASCII code for a space.
#include <stdio.h>

int main()
{
    int c;
    int charcount = 0;
    int wordcount = 1;
    int linecount = 0;

    while (c != EOF)
    {
        c = getchar();
        if (c == EOF)
            break;
        if (c == 10)
            linecount++;
        charcount++;
        if (c == 32)
            wordcount++;
    }
    printf ("%d %d %d\n", charcount, wordcount, linecount);
    return 0;
}
So for example, one of the txt files says:
Said Hamlet to Ophelia,
I'll draw a sketch of thee,
What kind of pencil shall I use?
2B or not 2B?
The word count here is 21, but I get a wordcount of 18. I also tried counting the number of "\n" characters, and that works for this test, but it fails for the next test.
Thanks in advance!
Upvotes: 0
Views: 3491
Reputation: 168
Include <ctype.h> and then change
if (c == 32)
    wordcount++;
to
if (isspace(c))
    wordcount++;
Words are separated by spaces, tabs, and newline characters.
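Note that counting whitespace still counts separators rather than words, so consecutive blanks, a trailing newline, or tokens such as "2B?" can throw the total off under the definition in the question. A minimal sketch of counting word starts instead is below; the names is_word_char and count_word_starts are made up for illustration, and it works on a string rather than stdin only to keep it easy to check.

```c
#include <ctype.h>
#include <stddef.h>

/* A word character is a letter or an apostrophe, per the question's definition. */
int is_word_char(int c)
{
    return isalpha((unsigned char)c) || c == '\'';
}

/* Count word starts: each move from a non-word character to a word character. */
int count_word_starts(const char *text)
{
    int count = 0;
    int in_word = 0;   /* 1 while the previous character was part of a word */

    for (size_t i = 0; text[i] != '\0'; i++)
    {
        if (is_word_char(text[i]))
        {
            if (!in_word)
                count++;
            in_word = 1;
        }
        else
            in_word = 0;
    }
    return count;
}
```

The same in_word flag should drop into the getchar() loop from the question unchanged, with wordcount starting at 0 instead of 1.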
Upvotes: 1
Reputation: 11438
Use a simple finite-state machine (FSM) coded in C:
#include <stdio.h>
#include <ctype.h>

enum {INITIAL, WORD, SPACE};

int main(void)
{
    int c;
    int state = INITIAL;
    int wcount = 0;

    c = getchar();
    while (c != EOF)
    {
        switch (state)
        {
        case INITIAL:   /* first character of the input */
            wcount = 0;
            if (isalpha(c) || c == '\'')
            {
                wcount++;
                state = WORD;
            }
            else
                state = SPACE;
            break;
        case WORD:      /* inside a word: wait for a separator */
            if (!isalpha(c) && c != '\'')
                state = SPACE;
            break;
        case SPACE:     /* between words: a letter or apostrophe starts a new word */
            if (isalpha(c) || c == '\'')
            {
                wcount++;
                state = WORD;
            }
            break;
        }
        c = getchar();
    }
    printf("%d words\n", wcount);
    return 0;
}
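Since the original program also reports characters and lines, here is a rough sketch of folding the state machine back into all three counts. It assumes INITIAL can be merged into SPACE, because both react to input the same way. struct counts, enum scan_state and count_text are hypothetical names, and the function takes a string instead of reading stdin so it can be tested in isolation.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical result type holding the three totals the question prints. */
struct counts { long chars; long words; long lines; };

enum scan_state { IN_SPACE, IN_WORD };

struct counts count_text(const char *text)
{
    struct counts r = {0, 0, 0};
    enum scan_state state = IN_SPACE;   /* start of input behaves like SPACE */

    for (size_t i = 0; text[i] != '\0'; i++)
    {
        int c = (unsigned char)text[i];
        int word_char = isalpha(c) || c == '\'';

        r.chars++;
        if (c == '\n')
            r.lines++;

        switch (state)
        {
        case IN_SPACE:              /* a word character starts a new word */
            if (word_char)
            {
                r.words++;
                state = IN_WORD;
            }
            break;
        case IN_WORD:               /* any other character ends the word */
            if (!word_char)
                state = IN_SPACE;
            break;
        }
    }
    return r;
}
```

Printing r.chars, r.words and r.lines with "%ld %ld %ld\n" would give the same output order as the program in the question.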
Upvotes: -1