Reputation: 152
I have a question about the function getline(), which seems to behave differently in two scenarios with respect to memory usage, as reported by valgrind. I post the code for both cases and explain the behavior of each.
I hope somebody can point me in the right direction.
In the first case, getline() is called in a while loop, reading all the lines of a text file into a buffer. The buffer is then freed ONLY ONCE, after the loop ends: in this case valgrind gives no errors (no leaks occur).
int main(int argc, char* argv[])
{
    char* buffer = NULL;
    size_t bufsize = 0;
    ssize_t nbytes;
    int counter = 0;
    char error = 0;
    FILE* input_fd = fopen(argv[1], "r");
    while ((nbytes = getline(&buffer, &bufsize, input_fd)) != -1)
    {
        counter += 1;
    }
    free(buffer);
    fclose(input_fd);
    return 0;
}
In the second case, the same loop calls a function that, in turn, calls getline(), passing the same buffer. Again, the buffer is freed only once, after the loop ends, but in this case valgrind reports a memory leak. Indeed, when I run the program and watch its RSS, I can see it increase as the loop goes on. Note that adding a free() inside the loop (freeing the buffer on every iteration) makes the problem disappear. Here's the code, followed by the valgrind output.
int my_getline(FILE* lf_fd, char** lf_buffer)
{
    ssize_t lf_nbytes = 0;
    size_t lf_bufsiz = 0;
    lf_nbytes = getline(lf_buffer, &lf_bufsiz, lf_fd);
    if (lf_nbytes == -1)
        return 1;
    return 0;
}
int main(int argc, char* argv[])
{
    char* lf_buffer = NULL;
    size_t bufsize = 0;
    ssize_t nbytes;
    int counter = 0;
    int new_line_counter = 0;
    char error = 0;
    FILE* lf_fd = fopen(argv[1], "r");
    while ((my_getline(lf_fd, &lf_buffer)) == 0)
    {
        // Added to allow measuring the RSS
        sleep(2);
        // If I uncomment this, no memory leak occurs
        //free(lf_buffer);
    }
    free(lf_buffer);
    fclose(lf_fd);
    return 0;
}
==9604== Memcheck, a memory error detector
==9604== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==9604== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==9604== Command: ./my_getline_x86 /media/sf_Scambio/processes.log
==9604== HEAP SUMMARY:
==9604== in use at exit: 1,194 bytes in 2 blocks
==9604== total heap usage: 8 allocs, 6 frees, 11,242 bytes allocated
==9604==
==9604== 1,194 bytes in 2 blocks are definitely lost in loss record 1 of 1
==9604== at 0x483DFAF: realloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==9604== by 0x48E371D: getdelim (iogetdelim.c:102)
==9604== by 0x1092B3: my_getline (my_getline.c:14)
==9604== by 0x10956A: main (my_getline.c:38)
==9604==
==9604== LEAK SUMMARY:
==9604== definitely lost: 1,194 bytes in 2 blocks
==9604== indirectly lost: 0 bytes in 0 blocks
==9604== possibly lost: 0 bytes in 0 blocks
==9604== still reachable: 0 bytes in 0 blocks
==9604== suppressed: 0 bytes in 0 blocks
==9604==
==9604== For lists of detected and suppressed errors, rerun with: -s
==9604== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Upvotes: 4
Views: 993
Reputation: 52529
The first program is fine.
The issue with the second one comes from the buffer length argument to getline(). Your my_getline() always sets it to 0, meaning getline() allocates a new buffer each time (at least with the glibc implementation you're using; see below). Change it to
int my_getline(FILE* lf_fd, char** lf_buffer, size_t* lf_bufsiz)
{
    ssize_t lf_nbytes = 0;
    lf_nbytes = getline(lf_buffer, lf_bufsiz, lf_fd);
    if (lf_nbytes == -1)
        return 1;
    return 0;
}
and pass a pointer to a size_t variable originally initialized to 0 when using it. The existing bufsize variable in main() looks like it would be appropriate to use:
//...
while ((my_getline(lf_fd, &lf_buffer, &bufsize)) == 0)
// ...
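Putting the pieces together, a minimal sketch of the fixed version might look like the following. The count_lines() wrapper is a hypothetical helper added only for illustration (it stands in for the loop in your main()), and the _POSIX_C_SOURCE define is an assumption about how you compile; with default glibc/gcc settings it is usually not needed.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes getline() in <stdio.h> */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Returns 0 if a line was read, 1 on EOF or error.
 * The caller owns both the buffer and its recorded size. */
int my_getline(FILE* lf_fd, char** lf_buffer, size_t* lf_bufsiz)
{
    ssize_t lf_nbytes = getline(lf_buffer, lf_bufsiz, lf_fd);
    return (lf_nbytes == -1) ? 1 : 0;
}

/* Hypothetical helper: reads every line and frees the buffer once. */
long count_lines(FILE* lf_fd)
{
    char* lf_buffer = NULL;
    size_t bufsize = 0;  /* lives across calls, so getline() can reuse the buffer */
    long counter = 0;

    assert(lf_fd != NULL);
    while (my_getline(lf_fd, &lf_buffer, &bufsize) == 0)
        counter += 1;

    free(lf_buffer);
    return counter;
}
```

Because bufsize now persists between calls, getline() sees the real capacity of the buffer and only reallocates when a line does not fit, so the single free() after the loop releases everything.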
While it was easy to work around, the memory leak you encountered appears to be a bug in the glibc implementation of getline().
From the POSIX documentation:
If *lineptr is a null pointer or if the object pointed to by *lineptr is of insufficient size, an object shall be allocated as if by malloc() or the object shall be reallocated as if by realloc(), respectively, such that the object is large enough to hold the characters to be written to it...
and the glibc manpage:
Alternatively, before calling getline(), *lineptr can contain a pointer to a malloc(3)-allocated buffer *n bytes in size. If the buffer is not large enough to hold the line, getline() resizes it with realloc(3), updating *lineptr and *n as necessary.
These suggest that, in the case you're running into, where you're passing a valid non-NULL pointer to memory and saying it's 0 length, the function should be using realloc() to resize it. However, the glibc implementation checks *lineptr == NULL || *n == 0 and, if true, overwrites *lineptr with a newly allocated buffer, causing the leak you saw. Compare the NetBSD implementation, which uses realloc() for all allocation (realloc(NULL, x) is equivalent to malloc(x)), and thus won't cause a leak with your original code. With your original code that's not ideal, because it causes a realloc() on every call instead of just when the buffer isn't big enough to hold the current line (unlike the fixed version above), but it works.
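To illustrate the realloc()-for-everything idea, here is a rough sketch. It is not the NetBSD or glibc source: readline_realloc() is a hypothetical name, it reads one character at a time with fgetc() for brevity, and the starting capacity of 64 with doubling growth is an arbitrary assumption. It also ignores size overflow and does not distinguish EOF from a read error.

```c
#define _POSIX_C_SOURCE 200809L  /* for ssize_t via <sys/types.h> */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical getline()-like helper that grows *lineptr only via realloc().
 * Returns the number of characters read, or -1 if nothing could be read. */
ssize_t readline_realloc(char** lineptr, size_t* n, FILE* fp)
{
    size_t cap;
    size_t len = 0;
    int c;

    assert(lineptr != NULL && n != NULL && fp != NULL);

    /* A NULL buffer has no capacity, whatever *n claims. */
    cap = (*lineptr != NULL) ? *n : 0;

    while ((c = fgetc(fp)) != EOF) {
        /* Need room for this character plus the terminating '\0'. */
        if (len + 2 > cap) {
            size_t newcap = (cap < 64) ? 64 : cap * 2;
            /* realloc(NULL, x) acts like malloc(x); a non-NULL pointer is
             * resized in place or moved, so the old block is never orphaned. */
            char* tmp = realloc(*lineptr, newcap);
            if (tmp == NULL)
                return -1;
            *lineptr = tmp;
            *n = cap = newcap;
        }
        (*lineptr)[len++] = (char)c;
        if (c == '\n')
            break;
    }

    if (len == 0)
        return -1;  /* EOF (or error) before any character was read */

    (*lineptr)[len] = '\0';
    return (ssize_t)len;
}
```

If a caller resets the size to 0 before every call, as the original my_getline() does, this sketch resizes the existing block on each call rather than replacing the pointer, so one free() at the end is still enough.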
Upvotes: 8