Hari Seldon
Hari Seldon

Reputation: 1060

strncat looks like its preserving data across calls?

I am seeing some odd behavior out of the strncat function in the string.h standard library, and would like some help understanding what is going on.


The crux of my issue is in a function I created called readLine with the purpose of returning the lines of a file as char * strings without a trailing newline terminator. That function looks like this:

char * readLine(FILE * fp) {
    char * chunk = NULL;
    char * line = NULL;
    int count = 0;

    // iterate through chunks of a line until we reach the end (or an error)
    while (fgets((chunk = malloc(sizeof(char) * BUFFER_SIZE)), BUFFER_SIZE, fp) != NULL) {

        // realloc on a null pointer works like malloc  
        line = realloc(line, ++count * BUFFER_SIZE * sizeof(char));    

        printf("chunk's contents: %s\n", chunk);

        // does chunk contain the end of a line?
        if(strchr(chunk, '\n') == NULL) {   
            // concatenate string parts and continue loop
            strncat(line, chunk, strlen(chunk) + 1);        
            free(chunk);

        } else {
            // we want to return a \0 terminated string without the \n
            // expected position of \n in chunk is ({length of chunk}-1)
            chunk[(strlen(chunk) - 1)] = '\0';

            // concatenate string parts
            strncat(line, chunk, strlen(chunk) + 1);
            printf("readLine line:    %s\n", line);
            free(chunk);

            break;        
        }        
    }
    return line;
}

I call it in main in a loop that looks like this:

FILE * fp = NULL;

if ((fp = fopen(FILE_PATH, "r")) != NULL) {
    char * line = NULL;

    while ((line = readLine(fp)) != NULL) {
        printf("main line:        %s\n\n", line);
        free(line);
    }

    fclose(fp);
}

Now the odd behavior comes in my definition of #define BUFFER_SIZE 1000. With it set like that I get the following output (which is not what I want):

chunk's contents: I am on line 1
readLine line:    I am on line 1
main line:        I am on line 1

chunk's contents: Over here I am on line 2
readLine line:    I am on line 1Over here I am on line 2
main line:        I am on line 1Over here I am on line 2

chunk's contents: Line 3 here
readLine line:    I am on line 1Over here I am on line 2Line 3 here
main line:        I am on line 1Over here I am on line 2Line 3 here

chunk's contents: Look out for 4
readLine line:    I am on line 1Over here I am on line 2Line 3 hereLook out for 4
main line:        I am on line 1Over here I am on line 2Line 3 hereLook out for 4

chunk's contents: Johnny 5 alive!
readLine line:    I am on line 1Over here I am on line 2Line 3 hereLook out for 4Johnny 5 alive!
main line:        I am on line 1Over here I am on line 2Line 3 hereLook out for 4Johnny 5 alive!

BUT if i change that definition to something like #define BUFFER_SIZE 20, I get the output I am looking for:

chunk's contents: I am on line 1

readLine line:    I am on line 1
main line:        I am on line 1

chunk's contents: Over here I am on l
chunk's contents: ine 2

readLine line:    Over here I am on line 2
main line:        Over here I am on line 2

chunk's contents: Line 3 here

readLine line:    Line 3 here
main line:        Line 3 here

chunk's contents: Look out for 4

readLine line:    Look out for 4
main line:        Look out for 4

chunk's contents: Johnny 5 alive!

readLine line:    Johnny 5 alive!
main line:        Johnny 5 alive!

I think I have the problem narrowed down to the strncat(line, chunk, strlen(chunk) + 1); line. I don't understand why the preceding lines are being included when my BUFFER_SIZE is sufficiently high.

Upvotes: 0

Views: 122

Answers (3)

Daniel Fischer
Daniel Fischer

Reputation: 183968

line = realloc(line, ++count * BUFFER_SIZE * sizeof(char));

does not initialise the allocated memory. So if the first realloc in readLine gives you back the chunk of memory you got in the previous invocation - not impossible, you may have the old contents still in there.

Anyway, with uninitialised memory, the first strncat may invoke undefined behaviour, since there need not be a 0 byte in the allocated memory.

Allocate a buffer to line before entering the loop, and write a 0 to the first byte.

Also, don't use

line = realloc(line, ++count * BUFFER_SIZE * sizeof(char));

If realloc fails, you leak memory. And you should check the return value of realloc,

char *temp = realloc(line, ++count * BUFFER_SIZE * sizeof(char));
if (temp == NULL) {
  // Oops
} else {
    line = temp;
}

And don't malloc to chunk in the fgets call,

while (fgets((chunk = malloc(sizeof(char) * BUFFER_SIZE)), BUFFER_SIZE, fp) != NULL)

If malloc fails, that invokes undefined behaviour too. malloc and check before calling fgets,

while ((chunk = malloc(sizeof(char) * BUFFER_SIZE)) && fgets(chunk, BUFFER_SIZE, fp) != NULL)

Upvotes: 4

user2088639
user2088639

Reputation:

Your problem is here:

    line = realloc(line, ++count * BUFFER_SIZE * sizeof(char));    

According to the man page for realloc:

"realloc(3) does not guarantee that the additional memory is also
 zero-filled."

and

"If ptr is NULL, realloc() is identical to a call to malloc() for size bytes."

So any new memory you get is likely to be filled with non-zero bytes, meaning the first time it is called, it's probably not going to have a 0 for the first byte, which means that strncat is going to be appending to whatever junk bytes are in the allocation.

Upvotes: 1

bash.d
bash.d

Reputation: 13207

You can stick with realloc and memset your buffers to zero before, though.

Upvotes: 1

Related Questions