ogs

Reputation: 1239

lack of understanding about sscanf usage

I would like to parse a specific line, so I wrote the following piece of code to test the logic, but I have probably misunderstood something:

typedef struct vers
{
   char tu8UVersion[5];
   char tu8UCommit[32];
}tst_prg_versions;

int main(int argc, char **argv)
{
    tst_prg_versions lstVer;
    char buf1[32];
    char buf2[32];

    char str[] = "BOARD-VERS-v1.0.0-git+9abc12345a";
    sscanf(str, "BOARD-VERS-v%5s-git+%s", lstVer.tu8UVersion, lstVer.tu8UCommit);
    printf("vers='%s'\n", lstVer.tu8UVersion);
    printf("commit='%s'\n", lstVer.tu8UCommit);

    sscanf(str, "BOARD-VERS-v%5s-git+%s", buf1, buf2);
    printf("vers='%s'\n", buf1);
    printf("commit='%s'\n", buf2);
    return 0;
}

Once executed it returns :

vers='1.0.09abc12345a'
commit='9abc12345a'
vers='1.0.0'
commit='9abc12345a'

Why is the first vers equal to 1.0.09abc12345a and not 1.0.0?

Upvotes: 3

Views: 282

Answers (3)

sjsam

Reputation: 21955

Why is the first vers equal to 1.0.09abc12345a and not 1.0.0?

Remember that you have

typedef struct vers
{
   char tu8UVersion[5];
   char tu8UCommit[32];
}tst_prg_versions;

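The memory for tu8UVersion and tu8UCommit is most likely contiguous. tu8UVersion has room for only five characters, so the terminating null that sscanf writes after 1.0.0 lands outside the array and is then overwritten by the commit string. As a result tu8UVersion is not null-terminated when you do:

printf("vers='%s'\n", lstVer.tu8UVersion);

so printf runs on into tu8UCommit and stops only at the null that terminates tu8UCommit.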

While sscanf seems the most sensible solution here, you could also give the field more room:

char tu8UVersion[32];
   /*  A version number can't get too big.
    *  So the first step is to allocate a
    *  reasonably (but not too) big size for it,
    *  so that you can be sure there are a few empty bytes at the end.
    */

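and then use a function to sanitize a string:

char* sanitized(char* ptr, size_t size)
{
  /* ptr[strlen(ptr)] is always '\0' by definition, so testing it does
     nothing; force a terminator into the last byte of the buffer instead. */
  if(size > 0)
     ptr[size - 1]='\0';
  return ptr;
}

and print it like:

 printf("vers='%s'\n", sanitized(lstVer.tu8UVersion, sizeof lstVer.tu8UVersion));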

Upvotes: 2

M Oehm

Reputation: 29116

Your problem has already been identified in the comments: You don't leave space for the terminating null character and the two strings are run together.

If you want to scan a version whose size you don't know beforehand, you can limit the characters to scan to decimal digits and points with %[.0-9] or to everything except a hyphen with %[^-]. (The %[...] format is like %s, except that you must provide a list of valid characters in the brackets. A caret as the first character means that the string is made up of characters that are not listed. In other words, %s is roughly short for %[^ \t\n], except that %s also skips leading whitespace.)

When you scan a string, you should test the return value of sscanf to be sure that all items have been scanned correctly and contain valid values.

Here's a variant that scans version numbers of up to 11 characters:

#include <stdlib.h>
#include <stdio.h>

typedef struct vers
{
   char tu8UVersion[12];
   char tu8UCommit[32];
} tst_prg_versions;

int main(int argc, char **argv)
{
    tst_prg_versions lstVer;

    char str[] = "BOARD-VERS-v1.0.0-git+9abc12345a";
    int n;

    /* %31s limits the commit to the 31 characters that fit in tu8UCommit */
    n = sscanf(str, "BOARD-VERS-v%11[^-]-git+%31s",
        lstVer.tu8UVersion, lstVer.tu8UCommit);

    if (n == 2) {
        printf("vers='%s'\n", lstVer.tu8UVersion);
        printf("commit='%s'\n", lstVer.tu8UCommit);
    } else {
        puts("Parse error.");
    }

    return 0;
}

Upvotes: 1

Aconcagua

Reputation: 25518

The first call actually reads 1.0.0! The problem, however, is that tu8UVersion is not null-terminated, so printf (not sscanf) prints beyond the field (which is undefined behaviour, as noted by sjsam) and runs into tu8UCommit, which immediately follows it (this does not necessarily have to be so; there could be fill bytes in between for alignment reasons).

You need to either print 5 characters at most (%.5s in the printf format string) or leave room to terminate tu8UVersion with 0, as already proposed in a comment.

Your separate buffers do not show the problem because they are large enough: sscanf always appends a terminating null after a %s conversion, and in a 32-byte buffer it fits. You should not rely on local buffers being initialized to 0, though (that may happen in a debug build, but it is not guaranteed). Had sscanf failed before writing to buf1, printing it could have output garbage and even read beyond the buffer.

Upvotes: 2
