Reputation: 71
Here is my current code:
unsigned long charcount = 0;
unsigned long wordcount = 0;
unsigned long linecount = 0;
int n;
for (; (n = getchar()) != EOF; ++charcount) {
    if (n == '\n')
        ++linecount;
    if (n == ' ' || n == '\n' || n == '\t')
        ++wordcount;
    printf("%lu %lu %lu\n", charcount, wordcount, linecount);
}
I think there's an issue with this code: if the text file I'm reading consists of paragraphs, the newlines that separate them are counted as words. I'm not sure how to fix it so that they aren't counted as words.
Upvotes: 0
Views: 117
Reputation: 154582
Consider using these definitions:
line beginning: present character is the first or the previous character was '\n'.
word beginning: present character is not a white-space and either it is the first or the previous character was a separator (white-space).
This approach detects the beginning of a line/word.
// Needs <ctype.h> for isspace() and <stdio.h> for getchar()/printf()
unsigned long charcount = 0;
unsigned long wordcount = 0;
unsigned long linecount = 0;
int previous = '\n';
int n;
while ((n = getchar()) != EOF) {
    ++charcount;
    if (isspace(previous)) {
        if (!isspace(n)) ++wordcount;       // Beginning of word detected
        if (previous == '\n') ++linecount;  // Beginning of line detected
    }
    previous = n;
}
printf("%lu %lu %lu\n", charcount, wordcount, linecount);
This approach works well, including under the following conditions:
Multiple spaces are treated like a single space (separator).
A file beginning with spaces, or not, does not throw off the word count.
A file ending with spaces, or not, does not throw off the word count.
The last line need not end with a '\n'.
Zero-length files are not a problem.
No post-EOF code is needed to adjust the line/word count.
No word/line/char length limitation other than ULONG_MAX.
Details on OP's code
for (; (n = getchar()) != EOF; ++charcount) {
    // This fails to count the last line of a file should it lack a \n
    if (n == '\n')
        ++linecount;
    // This counts separator (white-space) occurrence.
    // Multiple spaces count as 2 words: not good
    // Files like "Hello" will count as 0 words: not good
    // Files like " Hello " will count as 2 words: not good
    if (n == ' ' || n == '\n' || n == '\t')
        ++wordcount;
    // Using `unsigned long` is good, maybe even `unsigned long long`.
    printf("%lu %lu %lu\n", charcount, wordcount, linecount);
}
OP is not getting enough "words". Let us assume any non-letter is a valid word separator.
unsigned long charcount = 0;
unsigned long wordcount = 0;
unsigned long linecount = 0;
int previous = '\n';
int n;
while ((n = getchar()) != EOF) {
    ++charcount;
    if (!isalpha(previous)) {
        if (previous == '\n') ++linecount;  // Beginning of line detected
        if (isalpha(n)) ++wordcount;        // Beginning of word detected
    }
    previous = n;
}
printf("%lu %lu %lu\n", charcount, wordcount, linecount);
Upvotes: 1
Reputation: 18865
Use an indicator that specifies whether you're currently reading a word or whitespace:
int isWord = 0;
while ((n = getchar()) != EOF) {
    if (isspace(n)) {
        if (n == '\n') ++linecount;
        if (isWord) {
            ++wordcount;
            isWord = 0;
        }
    }
    else {
        isWord = 1;
    }
}
if (isWord)
    ++wordcount;
Upvotes: 2
Reputation: 368
Try to store the previous char in a variable and compare.
unsigned long int charcount = 0;
unsigned long int wordcount = 0;
unsigned long int linecount = 0;
int n;
int prev = '\n';
while ((n = getchar()) != EOF) {
    charcount++;
    if (n == '\n' && prev != '\n') {
        linecount++;
    }
    if (n == ' ' || n == '\n' || n == '\t')
        wordcount++;
    prev = n;
}
printf("%lu %lu %lu\n", charcount, wordcount, linecount);
Upvotes: 0