Reputation: 3
I have some code that was written to be as conservative as possible with memory use, so it does things like use realloc()
for building strings a character at a time instead of a one-time fixed-length buffer as is commonplace. The advantage I see to this is you don't ever over-use memory in environments where that's important and you don't have to worry about the buffer being too small for corner cases and writing some realloc()
code anyway to cover that. My question is if this thrashes memory horribly or can cause any other problems, or if perhaps the compiler optimizes this away.
Here is a sample program that reads a text file using no fixed-length buffers and it works fine. Why would you or would you not write code like this? Is it a patently bad idea?
// Compile with: gcc a.c -o a
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char* argv[]) {
void *tmp;
char **array = NULL; // array of line strings
size_t len = 0; // length of a line
int i = 0; // line count
// Open file
FILE* f;
if (!(f = fopen("./test.txt", "r"))) {
printf("Err: open file\n");
return 1;
}
// Read file into memory
int chr;
while ((chr = fgetc(f)) != EOF) {
// Reserve mem for another line
if (len == 0) {
// add 2 -- 1 for null terminator, 1 cuz i starts at 0
if ((tmp = realloc(array, (i+2)*sizeof(*array))) == NULL) {
printf("Err: realloc\n");
return 1;
}
array = tmp;
array[i] = NULL;
array[i+1] = NULL;
}
// Increase mem for next char in the line
// add 2 -- 1 for null terminator, 1 cuz len starts at 0
// Q: Why is best practice to omit sizeof knowing that for char it is always 1... all other alloc's need to use sizeof so the inconsistency leads to mistakes
if ((tmp = realloc(array[i], (len+2)*sizeof(*(array[i])))) == NULL) {
// if ((tmp = realloc(array[i], len+2)) == NULL) {
printf("Err: realloc\n");
return 1;
}
array[i] = tmp;
array[i][len] = (char)chr;
array[i][len+1] = '\0';
len++;
// newline
if (chr == '\n') {
len = 0;
i++;
}
}
if (ferror(f)) {
printf("Err: read file\n");
return 1;
}
// Verify input
printf("\nRead:\n-----\n");
for (i=0; array[i] != NULL; i++) { // array is NULL terminated
printf("[%d] %s", i+1, array[i]);
free(array[i]);
}
free(array);
printf("\nDone\n");
fclose(f);
return 0;
}
Upvotes: 0
Views: 270
Reputation: 36620
As noted in comments, the issue with growing your dynamically allocated buffer one unit at a time is that you have to call realloc
every single time, and realloc
is potentially very expensive as it may involve copying the data currently allocated to a different place in memory, as opposed to "growing" into adjacent memory.
As such, you'd prefer to grow it as few times as possible. While the ideal growth rate is subject to some discussion, we can safely assume it's not +1, but closer to x2. This is straightforward. If you know how to grow the array by adding 1, you can multiply the size by 2.
What will also help you is taking this logic out of main
so that:
We'd probably start out with a struct like:
typedef struct mystr {
size_t len;
size_t cap;
char *str;
} mystr_t;
And then we'd create a bunch of functions to interact with values of this type.
mystr_t *mystr_new();
void mystr_free(mystr_t *s);
int mystr_full(mystr_t *s);
mystr_t *mystr_from_cstr(const char *cs);
mystr_t *mystr_grow(mystr_t *s);
mystr_t *mystr_cpy(mystr_t *dst, const char *src);
mystr_t *mystr_cat(mystr_t *dst, const char *src);
void mystr_debug(mystr_t *s);
mystr_t *mystr_optimize(mystr_t *s);
Now, writing a function to read a line from a file is pretty straightforward:
#define LINE_BUF_SIZE 1024
mystr_t *readline(FILE *fp) {
if (!fp) return NULL;
mystr_t *s = mystr_new();
if (!s) return NULL;
char buf[LINE_BUF_SIZE];
while (1) {
if (!fgets(buf, LINE_BUF_SIZE, fp)) break;
size_t len = strlen(buf);
if (buf[len - 1] == '\n') {
buf[len - 1] = '\0';
mystr_cat(s, buf);
break;
}
else {
mystr_cat(s, buf);
}
}
return s;
}
Now, reading a line from standard input in main
looks really easy.
int main(void) {
mystr_t *line = readline(stdin);
mystr_debug(line);
mystr_free(line);
}
Upvotes: 0
Reputation: 1
When you call malloc()
or realloc()
for the first time, he gonna allocated a big chunk of memory. That for future call and not asking for more memory at every single use.
realloc()
Before everything look behind the allocated block and see if he can extend the block with some free space. if he can't he malloc()
a new memory space for you.
That a good strategy, but that don't work that well in your case.
The problem here is you realloc()
multiple pointer.
Every time he malloc()
a new block for you, he need to copy the previous block in the new one. After that he free your first block and return the new one.
So when you try to realloc()
multiple pointer, your memory look like a mess and you just create some fragmentation.
Other thing, memory is really cheap now day's, you can allocated much more.
To compare with c++ container implementation, they alloc the space they need * 2, avoiding linear time to increase performance.
You lose a lot of time with all the realloc()
call's.
Upvotes: 0