Reputation: 1060
I am seeing some odd behavior out of the strncat
function in the string.h
standard library, and would like some help understanding what is going on.
The crux of my issue is in a function I created called readLine
with the purpose of returning the lines of a file as char *
strings without a trailing newline terminator. That function looks like this:
char * readLine(FILE * fp) {
char * chunk = NULL;
char * line = NULL;
int count = 0;
// iterate through chunks of a line until we reach the end (or an error)
while (fgets((chunk = malloc(sizeof(char) * BUFFER_SIZE)), BUFFER_SIZE, fp) != NULL) {
// realloc on a null pointer works like malloc
line = realloc(line, ++count * BUFFER_SIZE * sizeof(char));
printf("chunk's contents: %s\n", chunk);
// does chunk contain the end of a line?
if(strchr(chunk, '\n') == NULL) {
// concatenate string parts and continue loop
strncat(line, chunk, strlen(chunk) + 1);
free(chunk);
} else {
// we want to return a \0 terminated string without the \n
// expected position of \n in chunk is ({length of chunk}-1)
chunk[(strlen(chunk) - 1)] = '\0';
// concatenate string parts
strncat(line, chunk, strlen(chunk) + 1);
printf("readLine line: %s\n", line);
free(chunk);
break;
}
}
return line;
}
I call it in main in a loop that looks like this:
FILE * fp = NULL;
if ((fp = fopen(FILE_PATH, "r")) != NULL) {
char * line = NULL;
while ((line = readLine(fp)) != NULL) {
printf("main line: %s\n\n", line);
free(line);
}
fclose(fp);
}
Now the odd behavior comes in my definition of #define BUFFER_SIZE 1000
. With it set like that I get the following output (which is not what I want):
chunk's contents: I am on line 1
readLine line: I am on line 1
main line: I am on line 1
chunk's contents: Over here I am on line 2
readLine line: I am on line 1Over here I am on line 2
main line: I am on line 1Over here I am on line 2
chunk's contents: Line 3 here
readLine line: I am on line 1Over here I am on line 2Line 3 here
main line: I am on line 1Over here I am on line 2Line 3 here
chunk's contents: Look out for 4
readLine line: I am on line 1Over here I am on line 2Line 3 hereLook out for 4
main line: I am on line 1Over here I am on line 2Line 3 hereLook out for 4
chunk's contents: Johnny 5 alive!
readLine line: I am on line 1Over here I am on line 2Line 3 hereLook out for 4Johnny 5 alive!
main line: I am on line 1Over here I am on line 2Line 3 hereLook out for 4Johnny 5 alive!
BUT if i change that definition to something like #define BUFFER_SIZE 20
, I get the output I am looking for:
chunk's contents: I am on line 1
readLine line: I am on line 1
main line: I am on line 1
chunk's contents: Over here I am on l
chunk's contents: ine 2
readLine line: Over here I am on line 2
main line: Over here I am on line 2
chunk's contents: Line 3 here
readLine line: Line 3 here
main line: Line 3 here
chunk's contents: Look out for 4
readLine line: Look out for 4
main line: Look out for 4
chunk's contents: Johnny 5 alive!
readLine line: Johnny 5 alive!
main line: Johnny 5 alive!
I think I have the problem narrowed down to the strncat(line, chunk, strlen(chunk) + 1);
line. I don't understand why the preceding line
s are being included when my BUFFER_SIZE
is sufficiently high.
Upvotes: 0
Views: 122
Reputation: 183968
line = realloc(line, ++count * BUFFER_SIZE * sizeof(char));
does not initialise the allocated memory. So if the first realloc
in readLine
gives you back the chunk of memory you got in the previous invocation - not impossible, you may have the old contents still in there.
Anyway, with uninitialised memory, the first strncat
may invoke undefined behaviour, since there need not be a 0 byte in the allocated memory.
Allocate a buffer to line
before entering the loop, and write a 0
to the first byte.
Also, don't use
line = realloc(line, ++count * BUFFER_SIZE * sizeof(char));
If realloc
fails, you leak memory. And you should check the return value of realloc
,
char *temp = realloc(line, ++count * BUFFER_SIZE * sizeof(char));
if (temp == NULL) {
// Oops
} else {
line = temp;
}
And don't malloc
to chunk
in the fgets
call,
while (fgets((chunk = malloc(sizeof(char) * BUFFER_SIZE)), BUFFER_SIZE, fp) != NULL)
If malloc
fails, that invokes undefined behaviour too. malloc
and check before calling fgets
,
while ((chunk = malloc(sizeof(char) * BUFFER_SIZE)) && fgets(chunk, BUFFER_SIZE, fp) != NULL)
Upvotes: 4
Reputation:
Your problem is here:
line = realloc(line, ++count * BUFFER_SIZE * sizeof(char));
According to the man page for realloc
:
"realloc(3) does not guarantee that the additional memory is also
zero-filled."
and
"If ptr is NULL, realloc() is identical to a call to malloc() for size bytes."
So any new memory you get is likely to be filled with non-zero bytes, meaning the first time it is called, it's probably not going to have a 0 for the first byte, which means that strncat is going to be appending to whatever junk bytes are in the allocation.
Upvotes: 1
Reputation: 13207
You can stick with realloc
and memset your buffers to zero before, though.
Upvotes: 1