Reputation: 1239
I would like to parse a specific line, so I wrote the following piece of code to test the logic, but I am probably misunderstanding something:
typedef struct vers
{
    char tu8UVersion[5];
    char tu8UCommit[32];
} tst_prg_versions;

int main(int argc, char **argv)
{
    tst_prg_versions lstVer;
    char buf1[32];
    char buf2[32];
    char str[] = "BOARD-VERS-v1.0.0-git+9abc12345a";

    sscanf(str, "BOARD-VERS-v%5s-git+%s", lstVer.tu8UVersion, lstVer.tu8UCommit);
    printf("vers='%s'\n", lstVer.tu8UVersion);
    printf("commit='%s'\n", lstVer.tu8UCommit);

    sscanf(str, "BOARD-VERS-v%5s-git+%s", buf1, buf2);
    printf("vers='%s'\n", buf1);
    printf("commit='%s'\n", buf2);
    return 0;
}
Once executed, it prints:
vers='1.0.09abc12345a'
commit='9abc12345a'
vers='1.0.0'
commit='9abc12345a'
Why is the first vers equal to 1.0.09abc12345a and not 1.0.0?
Upvotes: 3
Views: 282
Reputation: 21955
Why is the first vers equal to 1.0.09abc12345a and not 1.0.0?
Remember that you have
typedef struct vers
{
    char tu8UVersion[5];
    char tu8UCommit[32];
} tst_prg_versions;
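I guess there is a good chance that the memory for tu8UVersion and tu8UCommit is contiguous. tu8UVersion has no room for a terminating null: %5s stores five characters plus the terminator, so the terminator lands one byte past the array and is then overwritten by the commit string. So when you do: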
printf("vers='%s'\n", lstVer.tu8UVersion);
it goes on to print tu8UCommit and stops only because tu8UCommit is null-terminated.
While adjusting the sscanf call seems the most sensible solution here, you could also give the field more room:
char tu8UVersion[32];
/* A version number can't get too big,
 * so the first step is to allocate a
 * reasonably (but not excessively) large size for it.
 * That way you can be sure there are a few empty bytes at the end.
 */
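and then use a function to sanitize the string:
char *sanitized(char *ptr, size_t size) /* size_t comes from stddef.h */
{
    /* strlen() cannot detect a missing terminator (it stops at the
     * first null it finds), so force one into the last byte instead. */
    ptr[size - 1] = '\0';
    return ptr;
}
and print it like:
printf("vers='%s'\n", sanitized(lstVer.tu8UVersion, sizeof lstVer.tu8UVersion));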
Upvotes: 2
Reputation: 29116
Your problem has already been identified in the comments: you don't leave space for the terminating null character, so the two strings run together.
If you want to scan a version whose size you don't know beforehand, you can limit the characters to scan to decimal digits and points with %[.0-9] or to everything except a hyphen with %[^-]. (The %[...] format is like %s, except that you must provide a list of valid characters in the brackets. A caret as the first character means that the string is made up of characters that are not listed. In other words, %s behaves roughly like %[^ \t\n], except that %s also skips leading whitespace.)
When you scan a string, you should test the return value of sscanf to be sure that all items have been scanned correctly and contain valid values.
Here's a variant that scans version numbers of up to 11 characters:
#include <stdlib.h>
#include <stdio.h>

typedef struct vers
{
    char tu8UVersion[12];   /* up to 11 characters plus the terminating null */
    char tu8UCommit[32];    /* up to 31 characters plus the terminating null */
} tst_prg_versions;

int main(int argc, char **argv)
{
    tst_prg_versions lstVer;
    char str[] = "BOARD-VERS-v1.0.0-git+9abc12345a";
    int n;

    /* Each field width is one less than the size of its array. */
    n = sscanf(str, "BOARD-VERS-v%11[^-]-git+%31s",
               lstVer.tu8UVersion, lstVer.tu8UCommit);

    if (n == 2) {
        printf("vers='%s'\n", lstVer.tu8UVersion);
        printf("commit='%s'\n", lstVer.tu8UCommit);
    } else {
        puts("Parse error.");
    }
    return 0;
}
Upvotes: 1
Reputation: 25518
The first call actually reads 1.0.0! The problem, however, is that tu8UVersion is not null-terminated: %5s stores five characters plus a terminating 0, so the terminator lands one byte past the 5-byte array and is then overwritten by the commit string. printf therefore prints beyond the field (doing so is undefined behaviour, as noted by sjsam), which happens to be immediately followed by tu8UCommit (this is not guaranteed; there could be fill bytes in between for alignment reasons).
You need to either print 5 characters at most (%.5s in the printf format string) or leave room for terminating tu8UVersion with 0, as already proposed in a comment. The second option is the cleaner one, because it also keeps sscanf from writing past the array.
Your separate buffers do not show the problem because buf1 is 32 bytes long, so the terminating 0 that sscanf writes after the five characters fits inside the array. Had buf1 been only 5 bytes as well, something similar could have happened, and with bad luck printf would have run on into whatever garbage follows the buffer, and even beyond.
Upvotes: 2