Lou Kosak

Reputation: 289

scanf buffering with tabs and newlines

I'm confused about the behavior of scanf when handling whitespace.

Here's the code I'm using:

#include <stdio.h>

int main(void)
{
  int val;
  int c;

  scanf("%d\t", &val);

  c = getchar();

  if(c == EOF)
    printf("eof");
  else if(c == '\t')
    printf("tab");
  else if(c == '\n')
    printf("newline");
}

And here's the input that I'm passing it:

1234<tab><newline>

I would expect this to print "newline", since the format string only asks for a tab, and scanf supposedly leaves trailing whitespace in the buffer by default. Instead, it prints "eof". The %d\t seems to be swallowing the newline.

Am I missing something about how scanf works?

Note that if I change it to the following, it correctly prints "newline":

#include <stdio.h>

int main(void)
{
  int val;
  int c;

  scanf("%d", &val);

  getchar(); /* note this extra getchar */

  c = getchar();

  if(c == EOF)
    printf("eof");
  else if(c == '\t')
    printf("tab");
  else if(c == '\n')
    printf("newline");
}

Upvotes: 1

Views: 3598

Answers (3)

zwol

Reputation: 140629

You have encountered one of the more notorious reasons why *scanf should never be used: the confusing whitespace handling. Your '\t' doesn't just match a single tab. It matches any amount of any kind of whitespace (including none at all), so it consumes the newline too!

The best way to do this sort of thing, assuming you have getline, looks like this:

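#define _POSIX_C_SOURCE 200809L  /* getline is POSIX.1-2008, not ISO C */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    char *line = 0;
    size_t cap = 0;   /* getline needs a valid size pointer, not NULL */
    char *p;
    long val;

    if (getline(&line, &cap, stdin) == -1) {
        free(line);
        puts("eof (no input)");
        return 0;
    }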

    val = strtol(line, &p, 10);
    if (*p == '\0')
        puts("eof (no tab)");
    else {
        if (*p != '\t')
            printf("no tab, %c instead\n", *p);
        p++;
        if (*p == '\0')
            puts("eof");
        else if (*p == '\t')
            puts("tab");
        else if (*p == '\n')
            puts("newline");
    }

    free(line);
    return 0;
}

If you don't have getline, fgets is often good enough. (WARNING: do not confuse fgets with gets. gets is dangerous and should never be used. fgets is merely inconvenient: if you want your program to be robust in the face of extra-long input lines, you have to detect and handle lines that don't fit in the buffer yourself.)

Upvotes: 1

William Pursell

Reputation: 212248

"I'm confused about the behavior of scanf when handling whitespace." is a universal claim!

Any whitespace character in the format string consumes all consecutive whitespace in the input (possibly none), so "\t" matches any run of whitespace.

Upvotes: 0

Kevin

Reputation: 56089

Any amount of whitespace in the pattern (\t) matches any amount of whitespace in the input (\t\n).

From the man page:

White space (such as blanks, tabs, or newlines) in the format string match any amount of white space, including none, in the input.

Upvotes: 3
