Reputation: 3692
I'm learning C using the K&R book, on a Windows machine. I am trying out the program (a bare-bones Unix word count) which counts lines, characters, and words. Although this program correctly counts the number of characters, the number of lines and words in my output are always 0 and 1, irrespective of what I enter. I am also somewhat stumped by one part of the program, which I'll get to next:
#include<stdio.h>
#define IN 1
#define OUT 0
int main()
{
int c,state, nc,nw,nl;
nl=nw=nc=0;
state=OUT;
while(c=getchar()!=EOF)
{
++nc;
if(c=='\n')
++nl;
if(c=='\n'||c=='\t'||c==' ')
state=OUT;
else if(state==OUT)
{
state=IN;
++nw;
}
}
printf("\n No. of characters, lines and words are : %d,%d,%d\n",nc,nl,nw);
return 0;
}
From the looks of it, this program is using nc, nl and nw, respectively, to count the number of characters, lines and words entered in the input stream. My understanding of the program logic, thus far, is:
IN and OUT are two variables used to indicate the current state of the program. IN indicates that the program is currently 'inside' a word; in other words, no space, newline or tab has been encountered so far in the characters entered. Or so I think.
Before the while loop, the STATE is set to OUT. This indicates that right now, no word has been encountered.
For every character read until end of input (Ctrl+Z), the character count nc is incremented. In the first if statement, if the character is a newline '\n', nl is incremented. This should keep track of the number of lines encountered.
The second if statement sets STATE to 0 whenever there is a blank, newline or tab. I've understood the logic thus far.
The else if statement checks whether STATE is OUT. Now, STATE will be OUT in two conditions:
when the program runs for the first time, and STATE is set to 0 before the while loop. Example: consider the input WORD. Here, before W is encountered, STATE is set to 0.
Now that STATE is 0, and the input is W, we come to the else if statement. The next input after W is O. So, STATE is set to 1 (indicating the program is inside a word), and the word count is incremented.
Continuing with the input WORD, what happens when R is encountered? What is the value of STATE now? Is it still 1 because it was set to 1 inside the last else-if statement? But then again, if that is 1, there is no condition for when STATE is 1.
Lastly, it's obvious that the program is flawed in some way, because in my sample output below, the number of lines and words are always fixed (0 and 1).
hello word
good morning
^Z
No. of characters, lines and words are : 24,0,1
I understand that my question is very long, but I'm really stumped and looking for answers to two major points:
Many thanks for your help
Upvotes: 3
Views: 805
Reputation: 338
How does the else-if statement logic work?
The if statement checks whether the character is a newline, a space or a tab, any of which ends a word. If it is, the state variable is set to OUT.
On the next loop iteration, if c is not a newline, tab or space, then because state is OUT, the else if branch runs.
The else if branch increments nw, because a character that follows a space, tab or newline (and is not itself one of them) starts a new word. It also sets state back to IN, so the remaining characters of the same word are not counted as new words.
EXAMPLE (state is the value after the character has been processed):
"WORD" => "W" -> nc++ nw++ state=IN, "O" -> nc++ state=IN, "R" -> nc++ state=IN, "D" -> nc++ state=IN
"WO RD" => "W" -> nc++ nw++ state=IN, "O" -> nc++ state=IN, " " -> nc++ state=OUT, "R" -> nc++ nw++ state=IN, "D" -> nc++ state=IN
And if you want to understand it more easily, add a printf just after the while statement:
while((c=getchar())!=EOF)
{
printf("number of chars = %d, number of words = %d, number of lines = %d, state = %d\n",nc,nw,nl,state);
That way you'll see what the code does on each loop iteration.
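The traces in the example above can also be checked without typing input by hand. The following is a minimal sketch, not part of the K&R program: the helper names step, state_after and count_words are hypothetical, and the input comes from a string instead of getchar().

```c
#include <assert.h>
#include <stddef.h>

#define IN  1   /* inside a word */
#define OUT 0   /* outside a word */

/* One loop iteration of the word-count logic (hypothetical helper).
   Returns the new state and increments *nw when a new word starts. */
static int step(int state, int c, int *nw)
{
    if (c == '\n' || c == '\t' || c == ' ') {
        state = OUT;
    } else if (state == OUT) {
        state = IN;
        ++*nw;
    }
    /* If c is not white space and state is already IN, nothing changes. */
    return state;
}

/* State after the first n characters of text have been processed. */
int state_after(const char *text, size_t n)
{
    int state = OUT;
    int nw = 0;
    size_t i;

    assert(text != NULL);
    for (i = 0; i < n && text[i] != '\0'; ++i)
        state = step(state, (unsigned char)text[i], &nw);
    return state;
}

/* Number of words in text, using the same state machine. */
int count_words(const char *text)
{
    int state = OUT;
    int nw = 0;
    size_t i;

    assert(text != NULL);
    for (i = 0; text[i] != '\0'; ++i)
        state = step(state, (unsigned char)text[i], &nw);
    return nw;
}
```

For instance, state_after("WORD", 3) covers the question about R: the state is still IN, no branch runs for that character, and the word count is left alone.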
Upvotes: 3
Reputation: 5169
Here is a very basic walk-through of the fixed code. I hope it will answer all of the original questions.
The only other suggestion is to enable and check compiler warning messages, as they often have clues about potential sources of errors. In fact, gcc and clang (with warnings such as -Wall enabled) will warn about the original program and suggest the correct fix.
Include the standard input/output (stdio) header file
#include <stdio.h>
Use the pre-processor to define two (constant) macros, which are used to represent the state of either being IN-side a word, or OUT-side a word. In this program, "outside" means that the current character (c) is a white space character.
White space is a character that does not display anything, but may modify the output, such as moving to the next character location (space), to the next tab stop (tab), or advancing to the next line (newline).
#define IN 1
#define OUT 0
Being a simple program, all of the code is located in the main function. That is okay for a short program like this one, but not a good idea in larger, more complex programs.
int main(int argc, char* argv[])
{
int c; /* This is a 'current' character being read from input */
int state; /* The state of being either IN- or OUT-side of a word. */
int nc; /* Count of number of characters read */
int nw; /* Count of number of "words" */
int nl; /* Line count */
nl = nw = nc = 0; /* Initialize the counts to zero */
state = OUT; /* Begin with the word 'state' being OUT-side of a word */
Get a single character from standard input (stdin) and assign it to the variable c. This is done first because of the (added) parentheses enclosing the expression c = getchar(). Then the result of this assignment (which is equal to c) is compared to EOF (end of file).
While the contents of c are not equal to EOF, the while loop's body executes repeatedly, until getchar() returns EOF and it is assigned to c.
while ( EOF != (c = getchar()) )
{
Since you have a new character, increment the character count variable, nc, by one.
++nc;
If c is a newline, increment the line count, nl.
if (c == '\n')
++nl;
If the variable c is a newline, tab, or space, then set the state variable to OUT, because these characters indicate that c is not part of a "word."
if (c == '\n' || c== '\t' || c == ' ') {
state = OUT;
}
If the previous if statement did not evaluate to true, follow the else statement.
The else statement consists of a second if statement which evaluates whether state is equal to OUT. If so, then execute the next block.
else if (state == OUT)
{
This block contains two statements: set state to IN, and increment the value of nw (word count).
state = IN;
++nw;
} /* end of "else if" block */
} /* end of while loop block */
After getchar() returns an EOF (end of file), and the while loop ends, the program prints this summary output before returning zero to the parent process (don't worry about that here, it's just house-keeping) and ending the program.
printf("\n No. of characters, lines and words are : %d, %d, %d\n", nc, nl, nw);
return 0;
} /* end of main */
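As a quick check of the fixed logic against the sample input from the question, here is a minimal sketch that runs the same loop over a string instead of stdin. The struct counts type and the count_text function are hypothetical names introduced only for this illustration, and the input string assumes Enter was pressed after each of the two lines.

```c
#include <assert.h>
#include <stddef.h>

#define IN  1   /* inside a word */
#define OUT 0   /* outside a word */

/* Hypothetical result type holding the three counters. */
struct counts {
    int nc;  /* characters */
    int nl;  /* lines */
    int nw;  /* words */
};

/* Same loop body as the walk-through, but reading from a string
   (an assumption made so the logic can be checked without stdin). */
struct counts count_text(const char *text)
{
    struct counts r = {0, 0, 0};
    int state = OUT;
    size_t i;

    assert(text != NULL);
    for (i = 0; text[i] != '\0'; ++i) {
        int c = (unsigned char)text[i];

        ++r.nc;
        if (c == '\n')
            ++r.nl;
        if (c == '\n' || c == '\t' || c == ' ') {
            state = OUT;
        } else if (state == OUT) {
            state = IN;
            ++r.nw;
        }
    }
    return r;
}
```

Calling count_text("hello word\ngood morning\n") gives 24 characters, 2 lines and 4 words. The character count matches the 24 reported in the question, while the line and word counts are what the fixed program should report instead of 0 and 1.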
Upvotes: 0
Reputation: 182619
You are getting wrong output because you are missing parentheses:
while((c=getchar())!=EOF)
      ^           ^
Without them, because != has higher precedence than =, you always compare the return value of getchar() with EOF and assign the result of this comparison to c. That is, c will always be either 1 or 0.
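The difference can be seen in isolation with a small sketch. The two helper functions below are hypothetical and exist only for illustration; the parameter ch stands in for whatever getchar() would have returned.

```c
#include <assert.h>
#include <stdio.h>   /* for EOF */

/* Mirrors  c = getchar() != EOF  from the original program. */
int value_without_parens(int ch)
{
    int c;

    c = ch != EOF;   /* parsed as c = (ch != EOF), so c is 0 or 1 */
    return c;
}

/* Mirrors  (c = getchar()) != EOF  from the fixed program. */
int value_with_parens(int ch)
{
    int c;
    int more = ((c = ch) != EOF);  /* c gets the character itself */

    assert(more == (c != EOF));    /* the loop condition still tests for EOF */
    return c;
}
```

With the original expression, c is 1 for every character read, so the loop still runs once per character and nc comes out right. But 1 never equals '\n', ' ' or '\t', so nl stays at 0, and the first character switches state to IN once and nothing ever switches it back, so nw stays at 1.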
Upvotes: 6