Okkaaj

Reputation: 125

Why does sscanf not work in this situation?

Currently I am reading a file and printing (to stdout) all the words/strings it contains.

Here is the code:

int scan_strings(FILE *in, FILE *out) 
{
    char buffer[64];
    int i = 0, n = 0;

    for(;;)
    {
        if (fscanf(in, "%*[^" charset "]") != EOF)
        {
            i = 0;
            while (fscanf(in, "%63[" charset "]%n", buffer, &n) == 1)
            {
                if (n < 4 && i == 0)
                {
                    break;
                }
                else
                {
                    i = 1;
                }

                fputs(buffer, out);
            }
            if (i != 0)
            {
                putc('\n', out);
            }
        }
        if (feof(in))
        {
            return 0;
        }
        if (ferror(in) || ferror(out))
        {
            return -1;
        }
    }
}

But what I am trying to do is search for the strings in a buffer that has already been read into memory.

I changed the in and out variables to unsigned char * and changed fscanf to sscanf. That, however, doesn't work. Am I misunderstanding the sscanf function, or is there something else wrong in my code?

How can I print all the strings from an already-read buffer? The data is binary.

I am working on Windows, and Linux portability isn't needed.

Upvotes: 0

Views: 317

Answers (2)

chux

Reputation: 154242

sscanf(data, "%*[^" charset "]") works differently from fscanf(in, "%*[^" charset "]") when data is binary.

Assume charset is some string like "123".

fscanf(in, "%*[^123]") will scan in as long as the char read is not '1', '2', or '3'.
This includes '\0'.

sscanf(data, "%*[^123]") will scan data as long as the char read is not '1', '2', or '3'.
This does not include '\0', as sscanf stops offering characters to scan once '\0' is encountered.

Using sscanf() to scan past a '\0' is not possible.
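
A minimal sketch of that behaviour, assuming a hypothetical helper skipped_by_sscanf() (the name is made up for illustration and is not in the question). It reports how many bytes the skipping scanset consumed, so the early stop at '\0' becomes visible:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: returns how many bytes sscanf() consumed while
   skipping characters that are not '1', '2', or '3'. */
int skipped_by_sscanf(const char *data)
{
    int n = 0;

    assert(data != NULL);
    /* %n is stored only if the scanset matched at least one character.
       EOF means the string was empty. */
    if (sscanf(data, "%*[^123]%n", &n) == EOF)
        return 0;
    return n;
}
```

With the six data bytes 'a', 'b', '\0', 'c', 'd', '1', the helper returns 2: the scan ends at the embedded '\0' and never reaches the '1'.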


[Edit]

OP: How should I go about doing it - for binary data(from buffer/variable)?
A: Additional code around sscanf() can be used to cope with it stopping the scan when '\0' is encountered. Something like this, just for the first sscanf():

size_t j=0;
for (;;) {

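  // if (fscanf(in, "%*[^" charset "]") != EOF)
  // Assumes a '\0' follows the buffer; otherwise sscanf() can read past datasize.
  while (j < datasize) {
    int n = 0;
    // The cast is needed if data is unsigned char *. %n is stored only if the scanset matched.
    sscanf((const char *)&data[j], "%*[^123]%n", &n);
    if (n > 0) j += n;
    else if (data[j] == '\0') j++;  // step over the '\0' that stopped sscanf()
    else break;                     // data[j] is in the charset
  }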

  if (j < datasize) {
    i = 0;
    ...

As you can see, things are getting ugly.
Let's try using strchr() instead (untested code):

size_t j=0;
for (;;) {

  while (j < datasize) {
    int ch = data[j];
    // The ch != 0 test is required: strchr() also matches the terminating '\0' of charset.
    if (ch && strchr(charset, ch) != NULL) break;
    j++;
  }

  if (j < datasize) {
    i = 0;
    ...

This is getting better, and it only covers the first sscanf().
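
Carrying the strchr() idea through both scans might look like the sketch below. It is an illustration under assumptions, not a drop-in replacement: scan_strings_mem(), in_charset() and MIN_RUN are hypothetical names, the charset is passed as a string instead of a macro, and the minimum length of 4 is taken from the n < 4 test in the question.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MIN_RUN 4  /* assumed from the `n < 4` test in the question */

/* Returns 1 if byte c is a member of charset; '\0' is never a member. */
static int in_charset(unsigned char c, const char *charset)
{
    return c != '\0' && strchr(charset, c) != NULL;
}

/* Writes every run of at least MIN_RUN charset bytes found in
   data[0..datasize) to out, one run per line. Returns the number of
   runs written, or -1 on a write error. */
int scan_strings_mem(const unsigned char *data, size_t datasize,
                     const char *charset, FILE *out)
{
    size_t j = 0;
    int count = 0;

    assert(data != NULL && charset != NULL && out != NULL);
    while (j < datasize) {
        /* Skip bytes outside the charset, including '\0'. */
        while (j < datasize && !in_charset(data[j], charset))
            j++;

        /* Measure the run of charset bytes starting at j. */
        size_t start = j;
        while (j < datasize && in_charset(data[j], charset))
            j++;

        if (j - start >= MIN_RUN) {
            fwrite(data + start, 1, j - start, out);
            putc('\n', out);
            if (ferror(out))
                return -1;
            count++;
        }
    }
    return count;
}
```

Because the loop is bounded by datasize and never hands the buffer to a string function, embedded '\0' bytes and a missing terminator are both harmless. A call such as scan_strings_mem(buf, len, "abcdefghijklmnopqrstuvwxyz", stdout) would print each lowercase run of four or more letters on its own line.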

Upvotes: 1

Roddy

Reputation: 68074

The problem is that your code never modifies in. When in is a file, fscanf moves through it sequentially. But sscanf doesn't work that way: every call starts scanning again at the pointer you pass it.

You need to find out how many characters sscanf read, and then increment in accordingly.

You're already getting the number of bytes read in n, so just add that to in.

 in += n;

... after the sscanf.
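
As a rough sketch of that pattern on ordinary NUL-terminated text (it does not address embedded '\0' bytes), here is a hypothetical count_digit_runs() that uses a digit scanset in place of the question's charset macro. Note that n has to be reset before each call, because %n is not stored when the scanset matches nothing:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: counts the runs of digits in a NUL-terminated
   string, advancing the pointer by the %n value after every sscanf(). */
int count_digit_runs(const char *in)
{
    char buffer[64];
    int count = 0;

    assert(in != NULL);
    for (;;) {
        int n = 0;

        /* Skip non-digits. EOF means the end of the string was reached. */
        if (sscanf(in, "%*[^0123456789]%n", &n) == EOF)
            break;
        in += n;  /* n is still 0 if there was nothing to skip */

        n = 0;
        if (sscanf(in, "%63[0123456789]%n", buffer, &n) != 1)
            break;
        in += n;
        count++;  /* a run longer than 63 digits is counted in pieces */
    }
    return count;
}
```

For example, count_digit_runs("ab12cd345") returns 2.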

Upvotes: 0
