jsmith123

Reputation: 70

How to use fscanf() format string

I am using fscanf() to read input from a file (I know I should be using fgets() but I'm not allowed) and I can't figure out how to use the format string right.

The input is in the format: M 03f8ab8,1

I need the letter, address, and the number to each be saved to a variable. Here is what I've got so far:

while(fscanf(file, " %s %s, %d", operation, address, &size) != -1)

As written, it puts the letter into the correct variable (operation), but adds ,number to the end of the address and then assigns something undefined to the size variable.

It should put each into its own respective variable (and ignore the comma)

How do I set up fscanf() to get this correctly?

Upvotes: 3

Views: 10865

Answers (2)

Some programmer dude

Reputation: 409166

The problem here is that the "%s" format reads space delimited strings, and since there's no space in 03f8ab8,1 it will all be read as a single string.

You can solve this with the "%[" format, which allows you to have some very simple pattern matching. You can for example use it to tell fscanf to read everything until (but not including) a comma. Like

fscanf(file, "%s %[^,], %d", operation, address, &size)

See e.g. this scanf (and family) reference for more details.

Also, you shouldn't compare the result of fscanf with -1. fscanf returns the number of items it successfully assigned (or EOF if input fails before any conversion), so check that it parsed all three items by comparing the return value with 3:

while (fscanf(file, "%s %[^,], %d", operation, address, &size) == 3) ...
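
To see the difference on the sample line from the question, here is a minimal sketch using sscanf on a string instead of fscanf on a file. The helper names and the 16-byte buffers are my own assumptions, and I added a width of 15 only to keep the sketch safe:

```c
#include <stdio.h>

/* Hypothetical helper: the question's format. "%s" stops only at
   white-space, so it swallows the comma and the number as well. */
int parse_with_s(const char *line, char op[16], char addr[16], int *size)
{
    return sscanf(line, " %15s %15s, %d", op, addr, size);
}

/* Hypothetical helper: "%[^,]" stops before the comma, the literal ','
   in the format then consumes it, and "%d" reads the number. */
int parse_with_scanset(const char *line, char op[16], char addr[16], int *size)
{
    return sscanf(line, " %15s %15[^,], %d", op, addr, size);
}
```

For "M 03f8ab8,1" the first helper returns 2 and leaves size untouched, with address holding "03f8ab8,1". The second returns 3 with address "03f8ab8" and size 1. That is also why comparing against 3 catches the failure while comparing against -1 does not.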

Note that the above format does not impose any limit on the strings it reads, which can overflow your buffers. If your strings are of a fixed size (i.e. they're arrays), use the maximum field width in the format to limit the number of characters that fscanf will read and store in each array.

For example (without knowing anything at all about your actual strings/arrays):

while (fscanf(file, "%1s %8[^,], %d", operation, address, &size) == 3) ...

With the above, the first string can't be longer than one character, and the second can't be longer than eight characters. Note that these numbers do not include the string null terminator, so each array needs room for one more element than the width given.
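
As a rough sketch of how the widths and array sizes fit together, assuming a one-character operation and an address of at most eight characters (both sizes are guesses, as is the helper name count_records):

```c
#include <stdio.h>

/* Hypothetical helper: read records such as "M 03f8ab8,1" from fp.
   Returns the number of complete records; *total gets the sum of sizes. */
int count_records(FILE *fp, long *total)
{
    char operation[2];  /* width 1 + 1 for the '\0' */
    char address[9];    /* width 8 + 1 for the '\0' */
    int size;
    int count = 0;

    *total = 0;
    /* The leading space skips the newline left over from the previous record. */
    while (fscanf(fp, " %1s %8[^,], %d", operation, address, &size) == 3) {
        *total += size;
        count++;
    }
    return count;
}
```

The loop stops on the first record that does not yield all three items, which covers both end of file and a malformed line.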

Upvotes: 9

Gobikrishnan R

Reputation: 29

fscanf(input_fp, "%30[^ ,\n\t]%30[^ ,\n\t]%30[^ ,\n\t]", ...

does not consume the ',' or the '\n' in the text file. Subsequent fscanf() attempts therefore also fail and return 0, which, not being EOF, causes an infinite loop.
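
A small sketch of that failure mode, assuming input like "Bob,42" (the helper name and the 31-byte buffers are made up for illustration):

```c
#include <stdio.h>

/* Hypothetical demo: scan twice with a scanset that excludes ','.
   Returns the result of the second fscanf call, or EOF if the
   first call did not read a field. */
int second_scan_result(FILE *fp, char first[31], char second[31])
{
    second[0] = '\0';
    if (fscanf(fp, "%30[^ ,\n\t]", first) != 1)
        return EOF;
    /* The ',' is still unread, so the scanset matches zero characters
       and this call fails without consuming anything. */
    return fscanf(fp, "%30[^ ,\n\t]", second);
}
```

With "Bob,42" in the file, the first call stores "Bob" and the second returns 0 with the ',' still waiting in the stream. A loop that only tests for EOF would repeat that failing call forever.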


I recommend changing from the fscanf() approach to fgets()/sscanf(), which better handles potential I/O and parsing errors:

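#include <stdio.h>

int main(void)
{
    // File names are placeholders; the original does not show how the files are opened.
    FILE *input_fp = fopen("input.txt", "r");
    FILE *output_fp = fopen("output.txt", "w");
    char buf[100];

    if (input_fp == NULL || output_fp == NULL)
        return 1;  // could not open one of the files

    // Read one line at a time, then parse it from memory.
    while (fgets(buf, sizeof buf, input_fp) != NULL)
    {
      char name[30];  // Ensure this size is 1 more than the width in the scanf format.
      char age_array[30];
      char occupation[30];
      #define VFMT " %29[^ ,\n\t]"
      int n;  // Use to check for trailing junk

      if (3 == sscanf(buf, VFMT "," VFMT "," VFMT " %n",
          name, age_array, occupation, &n) && buf[n] == '\0')
      {
        // Suspect OP really wants this width to be 1 more
        if (fprintf(output_fp, "%-30s%-30s%-30s\n", name, age_array, occupation) < 0)
          break;  // write error
      } else
        break;  // format error
    }
    fclose(input_fp);
    fclose(output_fp);
    return 0;
}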

Rather than call ferror(), check return values of fgets(), fprintf().

Suspect OP's undeclared field buffers were [30] and adjusted scanf() accordingly.

Details about if (3 == sscanf(buf, VFMT "," ...

The if (3 == sscanf(...) && buf[n] == '\0') { becomes true when:

1) each of the 3 "%29[^ ,\n\t]" format specifiers scans in at least 1 char.

2) buf[n] is the end of the string. n is set via the "%n" specifier. The preceding ' ' in " %n" causes any white-space following the last "%29[^ ,\n\t]" to be consumed. When sscanf() then sees "%n", it stores the current offset from the beginning of the scan in the int pointed to by &n.

VFMT "," VFMT "," VFMT " %n" is concatenated by the compiler to

" %29[^ ,\n\t], %29[^ ,\n\t], %29[^ ,\n\t] %n".

I find the former easier to maintain than the latter.

The first space in " %29[^ ,\n\t]" directs sscanf() to scan over (consume and not save) 0 or more white-space characters (' ', '\t', '\n', etc.). The rest directs sscanf() to consume and save 1 to 29 chars, none of which may be ' ', ',', '\n' or '\t', then append a '\0'.
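
To try the check in isolation, here is a minimal sketch that wraps the same format in a function. The name parse_three and its 0/1 return convention are my own, not part of the code above:

```c
#include <stdio.h>

#define VFMT " %29[^ ,\n\t]"

/* Hypothetical helper: returns 1 if buf holds exactly three
   comma-separated fields followed only by white-space, else 0. */
int parse_three(const char *buf, char a[30], char b[30], char c[30])
{
    int n = 0;
    /* Adjacent string literals are joined at compile time, so this is
       " %29[^ ,\n\t], %29[^ ,\n\t], %29[^ ,\n\t] %n". */
    return sscanf(buf, VFMT "," VFMT "," VFMT " %n", a, b, c, &n) == 3
           && buf[n] == '\0';  /* anything left after %n is trailing junk */
}
```

A line such as "Ann, 31, nurse\n" passes. "Ann,31,nurse extra" fails the buf[n] test because "extra" remains after the third field, and "Ann,31" fails because only two fields are converted.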

Upvotes: 1
