Reputation: 33
I have a source code in C.
#include <stdio.h>
#define IN_W 1
#define OUT_W 0
#define SPACE 32
#define TAB 9
int main() {
int c, state, temp;
state = OUT_W;
while ((c = getchar()) != EOF) {
if ((c != SPACE || c != TAB) && (state == OUT_W)) {
state = IN_W;
temp = c;
c = 13;
putchar(c);
c = 10;
putchar(c);
putchar(temp);
} else if (c != SPACE || c != TAB)
putchar(c);
else
state = OUT_W;
}
return 0;
}
What I want to achieve is I will type in some characters/words and catch those inputs by getchar. When ever getchar receive any characters besides space or tab, it will print a new line and then print those characters until it found a space or tab (abandon them). For example, when I type
123 eat 4bananas in themorning
the program will print
123
eat
4bananas
in
themorning
I tried to integrate it with CR or LF, but it still print "123 eat 4bananas in themorning".
My questions are: 1. What did I miss? 2. In the last 'else', which one is more efficient for the running program:
else
state = OUT_W;
or
else if ((c == SPACE || c == TAB) && state == IN_W)
state = OUT_W;
else
continue; // or can I use single ';' since we do nothing in here?
That's all. Thank you for your help.
Note: I tried playing with '\n' and '\t' too.
Regards, Mario
Upvotes: 3
Views: 3269
Reputation: 881113
For a start:
(c != SPACE || c != TAB)
is always true. A character cannot be both space and tab at the same time, hence it must always be either a non-tab or non-space. I suspect you meant:
(c != SPACE && c != TAB)
That's why the state is never going back to OUT_W
, because after the first line end sequence, the second if
statement is always true, so it will never get to that final else
bit.
The following code works okay:
#include <stdio.h>
#define IN_W 1
#define OUT_W 0
#define SPACE 32
#define TAB 9
int main (void) {
int c, state, temp;
state = OUT_W;
while ((c = getchar()) != EOF) {
if ((c != SPACE && c != TAB) && (state == OUT_W)) {
state = IN_W;
temp = c;
c = 13;
putchar(c);
c = 10;
putchar(c);
putchar(temp);
} else if (c != SPACE && c != TAB)
putchar(c);
else
state = OUT_W;
}
return 0;
}
although it still has that annoying initial newline, which you can fix by simply setting the initial state to IN_W
.
There's also a lot of magic numbers in your code and some rather unnecessary moving of values. Possibly a more polished version would be:
#include <stdio.h>
#define IN_W 1
#define OUT_W 0
#define SPACE ' '
#define TAB '\t'
#define CR '\r'
#define LF '\n'
int main (void) {
int c, state;
state = IN_W;
while ((c = getchar()) != EOF) {
if ((c != SPACE) && (c != TAB) && (state == OUT_W)) {
putchar(CR);
putchar(LF);
putchar(c);
state = IN_W;
} else if ((c != SPACE) && (c != TAB))
putchar(c);
else
state = OUT_W;
}
return 0;
}
One thing I will mention is that it's often preferable to separate the state machine itself from the actions carried out. To that end, I would make the primary choice based on the current state rather than the character/state pair, and separate the actions for each state from the state machine.
I think that makes things a lot more readable, and easier to modify:
#include <stdio.h>
enum tState { ST_WORD, ST_SPACE };
static enum tState doWord (int ch) {
if ((ch == ' ') || (ch == '\t')) {
putchar ('\r');
putchar ('\n');
return ST_SPACE;
}
putchar (ch);
return ST_WORD;
}
static enum tState doSpace (int ch) {
if ((ch == ' ') || (ch == '\t'))
return ST_SPACE;
putchar (ch);
return ST_WORD;
}
int main (void) {
int ch;
enum tState state = ST_WORD;
while ((ch = getchar()) != EOF) {
switch (state) {
case ST_WORD: state = doWord (ch); break;
case ST_SPACE: state = doSpace (ch); break;
}
}
return 0;
}
Upvotes: 1
Reputation: 76695
This expression is not what you want: (c != SPACE || c != TAB)
This is always true. If c
is SPACE
then it is not TAB
, so the second part would be true. If c
is TAB then it is not SPACE
so the first part would be true.
In both cases, what you want is (c != SPACE && c != TAB)
This is only true when c
is not SPACE
and also not TAB
. The operator &&
is Boolean "and".
Also, I suggest that instead of magic numbers like 13
you should use C character constants like '\r'
.
As for your second question, your program is not too bad as written. I definitely don't think you would improve it by putting in a continue
and I don't even quite see how it would work. (As you noted, if the else continue;
is at the very end of the loop, you can leave out the continue
; actually, you could then just chop off the whole else
, because else;
does nothing.)
You have written a little state machine. You have three interesting cases:
There is a fourth possibility:
If you want the most efficient code, I think it would be best to rearrange it so that you only need to check for SPACE
and TAB
in one place:
while ((c = getchar()) != EOF) {
if (c == SPACE || c == TAB) {
state = OUT_W;
}
else {
/* c is not SPACE or TAB so we will print it */
if (state == OUT_W) {
/* transition from OUT_W to IN_W */
state = IN_W;
putchar('\r');
putchar('\n');
putchar(c);
}
else
putchar(c);
}
}
And with this restructured version of the code, it becomes clear that any time you are in IN_W you print the character, but only on the transition you print the CR/LF. So you could shorten this to not have an else
, always call putchar(c);
, but do that after the check for the transition. I will leave that as an exercise for you.
Upvotes: 3