Graphics Noob

Reputation: 10050

Can fscanf() read whitespace?

I already have some code that reads a text file using fscanf(), and now I need to modify it so that fields that were previously whitespace-free can contain whitespace. The text file is basically in the form of:

title: DATA
title: DATA
etc...

Each line is parsed using fgets(inputLine, 512, inputFile); sscanf(inputLine, "%*s %s", data);, which reads the DATA fields and ignores the titles, but now some of the data fields need to allow spaces. I still need to ignore the title and the whitespace immediately after it, but then read in the rest of the line, including any whitespace.

Is there any way to do this with the sscanf() function?

If not, what is the smallest change I can make to the code to handle the whitespace properly?

UPDATE: I edited the question to replace fscanf() with fgets() + sscanf(), which is what my code actually uses. I didn't think the difference was relevant when I first wrote the question, which is why I simplified it to fscanf().

Upvotes: 6

Views: 50783

Answers (6)

Norman Ramsey

Reputation: 202475

You're running up against the limits of what the *scanf family is good for. With fairly minimal changes you could try using the string-scanning modules from Dave Hanson's C Interfaces and Implementations. This stuff is a retrofit from the programming language Icon, an extremely simple and powerful string-processing language which Hanson and others worked on at Arizona. The departure from sscanf won't be too severe, and it is simpler, easier to work with, and more powerful than regular expressions. The only downside is that the code is a little hard to follow without the book, but if you do much C programming, the book is well worth having.

Upvotes: 1

pmg

Reputation: 108968

If you cannot use fgets(), use the %[ conversion specifier with a negated scanset (the "exclude option", written with a leading ^):

char buf[100];
fscanf(stdin, "%*s %99[^\n]", buf);
printf("value read: [%s]\n", buf);

But fgets() is way better.


Edit: version with fgets() + sscanf()

char buf[100], title[100];
fgets(buf, sizeof buf, stdin); /* expect string like "title: TITLE WITH SPACES" */
sscanf(buf, "%*s %99[^\n]", title);
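In real code it is worth checking the return value of sscanf(), since a line with a title but no data assigns nothing. A minimal sketch of that check, where parse_data_field is a hypothetical helper name and the 100-byte buffer is just the size used above:

```c
#include <stdio.h>

/* Hypothetical helper (not part of any library): copies everything after
   the first whitespace-delimited token of `line` into `out`, which must
   hold at least 100 bytes. Returns 1 on success, 0 if no data was found. */
static int parse_data_field(const char *line, char *out)
{
    /* %*s      reads and discards the title
       ' '      skips the whitespace after the title
       %99[^\n] stores up to 99 characters, spaces included, up to newline */
    return sscanf(line, "%*s %99[^\n]", out) == 1;
}
```

A caller would typically run this on each buffer filled by fgets() and skip or report the lines for which it returns 0.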

Upvotes: 14

Chris Dodd

Reputation: 2960

A %s specifier in fscanf skips any whitespace on the input, then reads a string of non-whitespace characters up to and not including the next whitespace character.

If you want to read up to a newline, you can use %[^\n] as a specifier. In addition, a ' ' in the format string will skip whitespace on the input. So if you use

fscanf(inputFile, "%*s %[^\n]", str);

it will read the first thing on the line up to the first whitespace ("title:" in your case), and throw it away, then will read whitespace chars and throw them away, then will read all chars up to a newline into str, which sounds like what you want.

Be careful that str doesn't overflow -- you might want to use

fscanf(inputFile, "%*s %100[^\n]", str);

to limit the maximum string length you'll read (100 characters, not counting the terminating NUL, so str needs room for at least 101 bytes).
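To make the width-versus-buffer-size relationship concrete, here is a small sketch using sscanf() on an already-read line (the same format works with fscanf()). The names parse_limited and DATA_MAX are made up for illustration, and the limit of 8 is deliberately tiny so the truncation is easy to see:

```c
#include <stdio.h>

enum { DATA_MAX = 8 };  /* assumed limit, deliberately small for illustration */

/* Hypothetical helper: stores at most DATA_MAX characters of the data field
   in `out`, which needs DATA_MAX + 1 bytes so the terminating NUL fits.
   Returns 1 if a data field was read, 0 otherwise. */
static int parse_limited(const char *line, char out[DATA_MAX + 1])
{
    /* The literal 8 in the format must be kept in sync with DATA_MAX. */
    return sscanf(line, "%*s %8[^\n]", out) == 1;
}
```

With the input "title: New York City", only "New York" is stored; the rest of the field is silently dropped, which is the trade-off of a fixed limit.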

Upvotes: 3

Pavel Minaev

Reputation: 101555

If you insist on using scanf, and assuming that you want newline as a terminator, you can do this:

scanf("%*s %[^\n]", str);

Note, however, that the above, used exactly as written, is a bad idea because there's nothing to guard against str overflowing (scanf doesn't know its size). You can, of course, choose a predefined maximum size and specify it as a field width, but then your program may not work correctly on some valid input.

If the size of the line, as defined by input format, isn't limited, then your only practical option is to use fgetc to read data char by char, periodically reallocating the buffer as you go. If you do that, then modifying it to drop all read chars until the first whitespace is fairly trivial.
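A rough sketch of that fgetc approach follows. The function name read_data_field, the initial capacity of 16, and the doubling strategy are all choices made for this example; it also assumes the title contains no whitespace and that only spaces or tabs separate it from the data:

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: skips the title and the blanks after it, then returns
   the rest of the line in a malloc'd string that the caller must free.
   Returns NULL at end of file or if memory allocation fails. */
static char *read_data_field(FILE *fp)
{
    size_t len = 0, cap = 16;
    char *buf;
    int c = fgetc(fp);

    if (c == EOF)
        return NULL;                    /* nothing left to read */

    /* Drop the title: everything up to the first whitespace character. */
    while (c != EOF && !isspace(c))
        c = fgetc(fp);
    /* Drop the blanks after the title, but stop at the newline. */
    while (c == ' ' || c == '\t')
        c = fgetc(fp);

    buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    /* Collect the rest of the line, doubling the buffer when it fills up. */
    while (c != EOF && c != '\n') {
        if (len + 1 == cap) {           /* keep one byte free for the NUL */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;
        c = fgetc(fp);
    }
    buf[len] = '\0';
    return buf;
}
```

Calling it in a loop until it returns NULL yields one data field per line, with an empty string for lines that have a title but no data.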

Upvotes: 3

Matteo Italia

Reputation: 126777

The simplest thing would be to issue a

fscanf(filePtr, "%*s");

to discard the first part and then just call fgets, which reads the rest of the line (starting with the whitespace right after the title):

fgets(str, stringSize, filePtr);
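Put together, the two calls might look like the sketch below. The wrapper name read_after_title is hypothetical, and trimming the leading blanks and the trailing newline is an extra step added here because fgets() keeps both:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: discards the title with fscanf, then reads the rest
   of the line with fgets. Returns 1 on success, 0 at end of input. */
static int read_after_title(FILE *filePtr, char *str, int stringSize)
{
    size_t skip;

    /* %*s assigns nothing, so fscanf returns 0 on success and EOF when
       the input is exhausted. */
    if (fscanf(filePtr, "%*s") == EOF)
        return 0;
    if (fgets(str, stringSize, filePtr) == NULL)
        return 0;

    str[strcspn(str, "\n")] = '\0';          /* drop the newline kept by fgets */
    skip = strspn(str, " \t");               /* blanks left after the title */
    memmove(str, str + skip, strlen(str + skip) + 1);
    return 1;
}
```

Note that a data field longer than stringSize - 1 characters is not handled here: the remainder stays in the stream and would be misread on the next call.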

Upvotes: 3

Andreas Bonini

Reputation: 44742

I highly suggest you stop using fscanf() and start using fgets() (which reads a whole line) and then parse the line that has been read.

This will give you considerably more freedom when parsing input that isn't formatted exactly as expected.
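As one possible way to parse the line by hand once fgets() has read it, the sketch below uses only strcspn() and strspn(). The helper name data_field is made up, and it assumes the title itself contains no spaces or tabs:

```c
#include <string.h>

/* Hypothetical helper: given a line of the form "title: DATA", returns a
   pointer to DATA inside the same buffer. The line is modified in place
   to remove the trailing line ending. */
static char *data_field(char *line)
{
    line[strcspn(line, "\r\n")] = '\0';  /* trim the line ending */
    line += strcspn(line, " \t");        /* skip the title */
    line += strspn(line, " \t");         /* skip the blanks after it */
    return line;
}
```

In the original code this would be called on inputLine right after fgets(inputLine, 512, inputFile) succeeds; a line with no data yields an empty string rather than a failure code.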

Upvotes: 3
