razz
razz

Reputation: 10110

sscanf unexpected results inside a loop

I am trying to parse this simple configuration file using fgets and sscanf:

# configuration file for client
[user]
ID      34DV4gx7
NAME    Somebody

I wrote the following script to parse it, in which sscanf initially seems to be extracting the variables properly then for some unknown reason it mixes them up:

int main (void)
{
    FILE *conf;
    char *confname = "client.conf";
    char buf[256], tmp[256];
    char id[8];
    char name[12];
    char token[40];
    size_t i, count = 0, valid = 0, len = sizeof token;

    if ((conf = fopen (confname, "r")) == NULL)
    {
        fprintf (stderr, "Failed to open configuration file %s\n", confname);
        return 1;
    }
    memset (id, 0, sizeof id);
    memset (name, 0, sizeof name);
    memset (token, 0, sizeof token);
    while (!feof (conf))
    {
        memset (buf, 0, sizeof buf);
        memset (tmp, 0, sizeof tmp);
        if (fgets (buf, sizeof buf, conf) == NULL) continue;
        if (buf[0] == '#' || buf[0] == '[') continue;
        if (sscanf (buf, "ID %s", tmp) == 1)
        {
            strncpy (id, tmp, sizeof id);
            id[strlen (id)] = '\0';
            printf ("id: %s[%d]\n", id, strlen (id));
            valid++;
            continue;
        }
        else if (sscanf (buf, "NAME %s", tmp) == 1)
        {
            strncpy (name, tmp, sizeof name);
            name[strlen (name)] = '\0';
            printf ("name: %s[%d]\n", name, strlen (name));
            valid++;
            continue;
        }
    }
    fclose (conf);

    printf ("id: %s\n", id);
    printf ("name: %s\n", name);

    if (valid != 2) return 2;
    for (i = 0; i < strlen (id) && count < len; i++) token[count++] = id[i];
    token[count++] = ':';
    for (i = 0; i < strlen (name) && count < len; i++) token[count++] = name[i];
    token[count] = '\0';
    printf ("token: %s\n", token);

    return 0;
}

Result:

id: 34DV4gx7[8]
name: Somebody[8]
id: 34DV4gx7Somebody
name: Somebody
token: 34DV4gx7Somebody:Somebody

Expected:

id: 34DV4gx7[8]
name: Somebody[8]
id: 34DV4gx7
name: Somebody
token: 34DV4gx7:Somebody

I tried many things to find out what's causing this behaviour but got nothing, I thought it could be that the id and name variables are not null terminated so I manually added \0 at the end and then I thought it could be that the buf is getting overwritten in the loop so I used memset to reset it and also reset all the char arrays and checked the length of everything but I just can't see what's going wrong. Any help would be greatly appreciated.

Upvotes: 2

Views: 146

Answers (1)

dbush
dbush

Reputation: 223972

As was mentioned in the comments, you're not properly adding the null byte to the end of id and name after calling strncpy.

From the man page:

The strncpy() function is similar, except that not more than n bytes of src are copied. Thus, if there is no null byte among the first n bytes of src, the result will not be null-terminated.

So after using strncpy you need to manually add a null byte as the last byte of the array. What you're doing instead is using strlen to find the length of the string. This function only works if the string is correctly null terminated, which after the strncpy call it might not be.

So instead of this:

id[strlen (id)] = '\0';
...
name[strlen (name)] = '\0';

Do this:

id[sizeof id - 1] = '\0';
...
name[sizeof name - 1] = '\0';

This adds the null byte as the last character.

Now to explain the behavior you were seeing:

When you first read in id, all 8 bytes of this array were populated with the 8 bytes of the string in question. It printed correctly because name appears in memory immediately after id (I'll explain how I know this momentarily) and name was initialized to all zeros outside of the loop, so the first byte of name (which contains a null byte) effectively terminates id.

Then when you read in name, the null terminator for id (which was actually in name) was overwritten. Then when you later print id, it prints the bytes from id but doesn't find the null byte, so it keeps reading the bytes where name lives until it finds the null terminator for that string and prints 34DV4gx7Somebody. The fact that id printed this was is how we know that name appears immediately after id in memory (in this particular case).

The reason you saw the error with id but not with name is because id wasn't big enough for the string you read in (so null terminator was not added), but name was big enough for its string (so a null terminator was added).

Upvotes: 3

Related Questions