Reputation: 96927
This question is a bit long due to the source code, which I tried to simplify as much as possible. Please bear with me and thanks for reading along.
I have an application with a loop that runs potentially millions of times. Instead of several thousands to millions of malloc/free calls within that loop, I would like to do one malloc up front and then several thousands to millions of realloc calls.
But I'm running into a problem where my application consumes several GB of memory and kills itself when I am using realloc. If I use malloc, my memory usage is fine.
If I run on smaller test data sets with valgrind's memcheck, it reports no memory leaks with either malloc or realloc.
I have verified that I am matching every malloc-ed (and then realloc-ed) object with a corresponding free.
So, in theory, I am not leaking memory; it is just that using realloc seems to consume all of my available RAM, and I'd like to know why and what I can do to fix this.
What I have initially is something like this, which uses malloc and works properly:
Malloc code
void A () {
    do {
        B();
    } while (someConditionThatIsTrueForMillionInstances);
}

void B () {
    char *firstString = NULL;
    char *secondString = NULL;
    char *someOtherString;

    /* populate someOtherString with data from stream, for example */

    C((const char *)someOtherString, &firstString, &secondString);

    fprintf(stderr, "first: [%s] | second: [%s]\n", firstString, secondString);

    if (firstString)
        free(firstString);
    if (secondString)
        free(secondString);
}

void C (const char *someOtherString, char **firstString, char **secondString) {
    char firstBuffer[BUFLENGTH];
    char secondBuffer[BUFLENGTH];

    /* populate buffers with some data from tokenizing someOtherString in a special way */

    *firstString = malloc(strlen(firstBuffer)+1);
    strncpy(*firstString, firstBuffer, strlen(firstBuffer)+1);

    *secondString = malloc(strlen(secondBuffer)+1);
    strncpy(*secondString, secondBuffer, strlen(secondBuffer)+1);
}
This works fine. But I want something faster.
Now I test a realloc arrangement, which malloc-s only once:
Realloc code
void A () {
    char *firstString = NULL;
    char *secondString = NULL;

    do {
        B(&firstString, &secondString);
    } while (someConditionThatIsTrueForMillionInstances);

    if (firstString)
        free(firstString);
    if (secondString)
        free(secondString);
}

void B (char **firstString, char **secondString) {
    char *someOtherString;

    /* populate someOtherString with data from stream, for example */

    C((const char *)someOtherString, &(*firstString), &(*secondString));

    fprintf(stderr, "first: [%s] | second: [%s]\n", *firstString, *secondString);
}

void C (const char *someOtherString, char **firstString, char **secondString) {
    char firstBuffer[BUFLENGTH];
    char secondBuffer[BUFLENGTH];

    /* populate buffers with some data from tokenizing someOtherString in a special way */

    /* realloc should act as malloc on first pass through */

    *firstString = realloc(*firstString, strlen(firstBuffer)+1);
    strncpy(*firstString, firstBuffer, strlen(firstBuffer)+1);

    *secondString = realloc(*secondString, strlen(secondBuffer)+1);
    strncpy(*secondString, secondBuffer, strlen(secondBuffer)+1);
}
If I look at the output of free -m on the command line while I run this realloc-based test with a large data set that causes the million-loop condition, my free memory goes from 4 GB down to 0 and the app crashes.
What am I missing about using realloc that is causing this? Sorry if this is a dumb question, and thanks in advance for your advice.
Upvotes: 3
Views: 3536
Reputation: 54554
realloc has to copy the contents from the old buffer to the new buffer if the resizing operation cannot be done in place. A malloc/free pair can be better than a realloc if you don't need to keep around the original memory.
That's why realloc can temporarily require more memory than a malloc/free pair. You are also encouraging fragmentation by continuously interleaving reallocs. I.e., you are basically doing:
malloc(A);
malloc(B);
while (...)
{
    malloc(A_temp);
    free(A);
    A = A_temp;
    malloc(B_temp);
    free(B);
    B = B_temp;
}
Whereas the original code does:
while (...)
{
    malloc(A);
    malloc(B);
    free(A);
    free(B);
}
At the end of each iteration of the second loop you have released all the memory you used. That is more likely to return the global heap to a clean state than interleaving allocations without ever freeing all of them at once.
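One possible way to keep the "allocate once" idea without the interleaved reallocs is to remember each buffer's capacity and only allocate when the next string does not fit. The following is a minimal sketch under that assumption, not the asker's code: the type ReusableBuf and the functions reusable_buf_set and reusable_buf_free are hypothetical names, and the starting capacity of 16 and the doubling policy are arbitrary choices.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper type: a heap buffer that remembers its capacity. */
typedef struct {
    char  *data;
    size_t cap;
} ReusableBuf;

/* Copy src into buf, allocating only when the current capacity is too small.
 * The old contents are not needed, so a fresh malloc followed by free is used
 * instead of realloc; nothing is copied from the old block.
 * Returns 0 on success, -1 on allocation failure (buf is left unchanged). */
int reusable_buf_set(ReusableBuf *buf, const char *src)
{
    size_t need = strlen(src) + 1;

    if (need > buf->cap) {
        size_t new_cap = buf->cap ? buf->cap : 16;  /* assumed starting size */
        while (new_cap < need)
            new_cap *= 2;       /* doubling keeps the number of allocations small */

        char *fresh = malloc(new_cap);
        if (fresh == NULL)
            return -1;

        free(buf->data);
        buf->data = fresh;
        buf->cap  = new_cap;
    }

    memcpy(buf->data, src, need);   /* need includes the terminating '\0' */
    return 0;
}

/* Release the buffer and reset it to the empty state. */
void reusable_buf_free(ReusableBuf *buf)
{
    free(buf->data);
    buf->data = NULL;
    buf->cap  = 0;
}
```

In the question's layout, A would own two such buffers initialized to {NULL, 0}, C would call reusable_buf_set on each, and A would call reusable_buf_free once after the loop. Once the buffers have grown to fit the longest strings, further iterations should not touch the allocator at all.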
Upvotes: 8
Reputation: 215193
Using realloc when you don't want to preserve the existing contents of the memory block is a very very bad idea. If nothing else, you'll waste lots of time duplicating data you're about to overwrite. In practice, the way you're using it, the resized blocks will not fit in the old space, so they get located at progressively higher and higher addresses on the heap, causing the heap to grow ridiculously.
Memory management is not easy. Bad allocation strategies lead to fragmentation, atrocious performance, etc. The best you can do is avoid introducing any more constraints than you absolutely have to (like using realloc when it's not needed), free as much memory as possible when you're done with it, and allocate large blocks of associated data together in a single allocation rather than in small pieces.
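As an illustration of the last point, the two strings from the question could share one allocation instead of two. This is only a sketch under that assumption: StringPair, string_pair_init and string_pair_free are hypothetical names that do not appear in the question.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical pair type: both strings live in one malloc'd block. */
typedef struct {
    char *first;    /* start of the block */
    char *second;   /* points into the same block, right after first */
} StringPair;

/* Copy a and b into a single allocation.
 * Returns 0 on success, -1 on allocation failure (p is left unchanged). */
int string_pair_init(StringPair *p, const char *a, const char *b)
{
    size_t la = strlen(a) + 1;  /* sizes include the terminating '\0' */
    size_t lb = strlen(b) + 1;

    char *block = malloc(la + lb);
    if (block == NULL)
        return -1;

    memcpy(block, a, la);
    memcpy(block + la, b, lb);

    p->first  = block;
    p->second = block + la;
    return 0;
}

/* One free releases both strings because they share a block. */
void string_pair_free(StringPair *p)
{
    free(p->first);
    p->first  = NULL;
    p->second = NULL;
}
```

With this layout each loop iteration performs one malloc and one free rather than two of each, and the two strings are released together.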
Upvotes: 1
Reputation: 93690
You are expecting &(*firstString) to be the same as firstString, but in fact it is taking the address of the argument to your function rather than passing through the address of the pointers in A. Thus every time you call you make a copy of NULL, realloc new memory, lose the pointer to the new memory, and repeat. You can easily verify this by seeing that at the end of A the original pointers are still null.
EDIT: Well, it's an awesome theory, but I seem to be wrong on the compilers I have available to me to test.
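For reference, the C standard specifies that when & is applied to the result of a unary *, neither operator is evaluated, so &(*firstString) yields the same pointer as firstString. A small check along these lines, using hypothetical stand-ins inner and forwards_same_pointer for the question's C and B, could look like this:

```c
#include <stdlib.h>

/* Hypothetical stand-in for C(): allocates through the double pointer.
 * The realloc result is assigned directly for brevity; real code should
 * check for NULL before overwriting the old pointer. */
void inner(char **out)
{
    *out = realloc(*out, 8);
}

/* Hypothetical stand-in for B(): forwards its argument as &(*p),
 * then reports whether the forwarded pointer equals the original. */
int forwards_same_pointer(char **p)
{
    char **forwarded = &(*p);   /* same value as p */
    inner(forwarded);
    return forwarded == p;
}
```

Calling forwards_same_pointer(&s) with char *s = NULL should return 1 and leave s pointing at the new block, so the pointers in the caller are updated and nothing is lost through this expression.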
Upvotes: 0