Reputation: 289
I'm confused about the behavior of scanf when handling whitespace.
Here's the code I'm using:
#include <stdio.h>

int main()
{
    int val;
    int c;

    scanf("%d\t", &val);
    c = getchar();
    if(c == EOF)
        printf("eof");
    else if(c == '\t')
        printf("tab");
    else if(c == '\n')
        printf("newline");
}
And here's the input that I'm passing it:
1234<tab><newline>
I would expect this to print "newline", since the scanf is only looking for tabs, and supposedly scanf leaves whitespace in the buffer by default. Instead, it prints "eof". The %d\t seems to be swallowing the newline.
Am I missing something about how scanf works?
Note that if I change it to the following, it correctly prints "newline":
#include <stdio.h>

int main()
{
    int val;
    int c;

    scanf("%d", &val);
    getchar(); /* note this extra getchar */
    c = getchar();
    if(c == EOF)
        printf("eof");
    else if(c == '\t')
        printf("tab");
    else if(c == '\n')
        printf("newline");
}
Upvotes: 1
Views: 3598
Reputation: 140629
You have encountered one of the more notorious reasons why *scanf should never be used: the confusing whitespace handling. Your '\t' doesn't just match a single tab; it matches any amount (including none) of any kind of whitespace, including the newline!
The best way to read line-oriented input like this, assuming you have getline, looks like this:
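#define _POSIX_C_SOURCE 200809L /* for getline on strict compilers */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    char *line = NULL;
    size_t cap = 0; /* getline needs a valid size pointer, not a null one */
    char *p;
    long val;

    /* Read one whole line; getline allocates and grows the buffer itself. */
    if (getline(&line, &cap, stdin) == -1) {
        puts("eof (no input)");
        free(line);
        return 0;
    }

    /* p is left pointing at the first character after the number. */
    val = strtol(line, &p, 10);
    if (*p == '\0')
        puts("eof (no tab)");
    else {
        if (*p != '\t')
            printf("no tab, %c instead\n", *p);
        p++;
        if (*p == '\0')
            puts("eof");
        else if (*p == '\t')
            puts("tab");
        else if (*p == '\n')
            puts("newline");
    }
    free(line);
    return 0;
}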
If you don't have getline, fgets is often good enough. (WARNING: do not confuse fgets with gets. gets is dangerous and should never be used, but fgets is only inconvenient if you want your program to be robust in the face of extra-long input lines.)
Upvotes: 1
Reputation: 212248
"I'm confused about the behavior of scanf when handling whitespace." is a universal claim!
Any whitespace in the format string consumes all consecutive whitespace at that point in the input, so "\t" matches any run of whitespace, not just a tab.
Upvotes: 0
Reputation: 56089
Any amount of whitespace in the pattern (\t) matches any amount of whitespace in the input (\t\n).
From the man page:
White space (such as blanks, tabs, or newlines) in the format string match any amount of white space, including none, in the input.
Upvotes: 3