Reputation:
I have the following two functions which takes an arrays of strings and makes them lowercase (in-place) --
#include <stdio.h>
#include <ctype.h>
#include <string.h>
#include <stdlib.h>
void to_lower(char ** strings) {
char * original_string;
char * lower_string;
for (int i=0; (original_string=strings[i]) != NULL; i++) {
lower_string = malloc(strlen(original_string + 1) * sizeof(char));
for (int j=0; j<=strlen(original_string); j++) {
lower_string[j] = tolower(original_string[j]);
}
strings[i]=lower_string;
}
}
int main(void) {
char * strings[] = {"Hello", "Zerotom", "new", NULL };
to_lower(strings);
to_lower(strings); // this and successive calls won't change
to_lower(strings); // anything but are here to try and understand malloc
to_lower(strings);
to_lower(strings);
return 0;
}
At the beginning of the main
function before to_lower
is called, how much memory is consumed? My guess was 16 bytes from the array of chars (15 chars + 1 null byte at end).
After to_lower
has run 5 times and before the function is returned, how much memory has been consumed? Where should I be "free"-ing the strings that are being passed into the function (as my thought was calling malloc
every time a string is copied/lower-cased it creates that much additional memory but never frees anything.
Does the to_lower
function look ok, or how can it be modified so that it doesn't leak memory if it currently is?
Upvotes: 0
Views: 139
Reputation: 153517
As an alternative to {"Hello", "Zerotom", "new", NULL };
and malloc()
and friends, form the array of pointers char * strings[]
to be initialized with pointers to modifiable data.
Since C99, use compound literals.
void inplace_strtolower(char * s) {
while (*s) {
*s = (char) tolower((unsigned char ) *s);
s++;
}
}
// "Non-const string literal"
// Compound literal from string literal"
#define NCSL(sl) ((char[sizeof(sl)]) { sl })
int main(void) {
char * strings[] = {NCSL("Hello"), NCSL("Zerotom"), NCSL("new"), NULL};
inplace_strtolower(strings[0]);
inplace_strtolower(strings[1]);
inplace_strtolower(strings[2]);
puts(strings[0]);
puts(strings[1]);
puts(strings[2]);
return 0;
}
Output
hello
zerotom
new
Upvotes: 0
Reputation: 84559
You are wrapping yourself around the axle (confusing yourself) by combining the allocation and conversion to lower into a single void to_lower(char ** strings)
function. Just as you have found, if you want to call the function twice on the same object, you have to free
the memory you allocated to hold the lowercase strings in between calls -- but then you have lost the pointers to your original strings..
While there is nothing wrong with combining multiple operations in a function, you do have to step back and make sure what you are doing makes sense and won't cause more problems than it solves.
Your allocation and copying of strings
is required before you modify the strings contained in strings
because you initialize strings
as an array of pointers to string-literals. String-literal are immutable (on all but a very few systems) created in read-only memory (generally the .rodata
section of the executable). Attempting to modify a string-literal will just about guarantee a SegFault (on all but the quirky few systems)
Further, how are you going to get the original strings back if you have already overwritten the pointer addresses to the originals with pointers to allocated memory holding the lower-case results? (not to mention the tracking headache you create by replacing pointers to literals with pointers to allocated memory as far as when it is okay to free
those pointers).
Here it is far better to leave you original strings
untouched and simply allocate a copy of the original (by allocating pointers, including one for the sentinel value, and allocating storage for each of the original strings and then copy the original strings before converting to lowercase. This solves your problem with memory leaks as well as the problem with losing your original pointers to the string literals. You can free the lowercase strings as needed and you can always make another copy of the originals to send to your conversion function again.
So how would you go about implementing this? The easiest way is just to declare a pointer to pointer to char (e.g. a double-pointer) to which you can assign a block of memory for any number of pointers you like. In this case just allocate the same number of pointers as you have in the strings
array, e.g.:
char *strings[] = {"Hello", "Zerotom", "new", NULL };
size_t nelem = *(&strings + 1) - strings; /* number of pointers */
/* declare & allocate nelem pointers */
char **modstrings = malloc (nelem * sizeof *modstrings);
if (!modstrings) { /* validate EVERY allocation */
perror ("malloc-modstrings");
}
(note: you can also use sizeof strings / sizeof *strings
to obtain the number of elements)
Now that you have modstrings
assigned a block of memory holding the same number of pointers as you have in strings
, you can simply allocate blocks of memory sufficient to hold each of the string literals and assign the starting address for each block to successive pointers in modstrings
setting the last pointer NULL
as your sentinel, e.g.
void copy_strings (char **dest, char * const *src)
{
while (*src) { /* loop over each string */
size_t len = strlen (*src); /* get length */
if (!(*dest = malloc (len + 1))) { /* allocate/validate */
perror ("malloc-dest"); /* handle error */
exit (EXIT_FAILURE);
}
memcpy (*dest++, *src++, len + 1); /* copy *src to *dest (advance) */
}
*dest = NULL; /* set sentinel NULL */
}
(note: by passing the src
parameter as char * const *src
instead of just char **src
, you can indicate to the compiler that src
will not be changed allowing further optimizations by the compiler. restrict
would be similar, but that discussion is left to another day)
Your to_lower
function then reduces to:
void to_lower (char **strings) {
while (*strings) /* loop over each string */
for (char *p = *strings++; *p; p++) /* loop over each char */
*p = tolower (*p); /* convert to lower */
}
As a convenience, since you know you will want to copy strings
to modstrings
before each call to to_lower
, you can combine both functions into a single wrapper (that does make sense to combine), e.g.
void copy_to_lower (char **dest, char * const *src)
{
copy_strings (dest, src); /* just combine functions above into single */
to_lower (dest);
}
(you could even add the print_array
and free_strings
above as well if you always want to do those operations in a single call as well -- more later)
In between each copy_to_lower
and print_array
of modstrings
you will need to free the storage assigned to each pointer so you do not leak memory when you call copy_to_lower
again. A simple free_strings
function could be:
void free_strings (char **strings)
{
while (*strings) { /* loop over each string */
free (*strings); /* free it */
*strings++ = NULL; /* set pointer NULL (advance to next) */
}
}
You can now allocate, copy, convert to lower, print, and free as many times as you like in main()
. you would simply make repeated calls to:
copy_to_lower (modstrings, strings); /* copy_to_lower to modstrings */
print_array (modstrings); /* print modstrings */
free_strings (modstrings); /* free strings (not pointers) */
copy_to_lower (modstrings, strings); /* ditto */
print_array (modstrings);
free_strings (modstrings);
...
Now recall, you are freeing the storage for each string when you call free_strings
, but you are leaving the block of memory containing the pointers assigned to modstrings
. So to complete your free of all memory you have allocated, don't forget to free the pointers, e.g.
free (modstrings); /* now free pointers */
Putting it altogether into an example you could do the following:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
void print_array (char **strings)
{
while (*strings)
printf ("%s, ", *strings++);
putchar ('\n');
}
void free_strings (char **strings)
{
while (*strings) { /* loop over each string */
free (*strings); /* free it */
*strings++ = NULL; /* set pointer NULL (advance to next) */
}
}
void copy_strings (char **dest, char * const *src)
{
while (*src) { /* loop over each string */
size_t len = strlen (*src); /* get length */
if (!(*dest = malloc (len + 1))) { /* allocate/validate */
perror ("malloc-dest"); /* handle error */
exit (EXIT_FAILURE);
}
memcpy (*dest++, *src++, len + 1); /* copy *src to *dest (advance) */
}
*dest = NULL; /* set sentinel NULL */
}
void to_lower (char **strings) {
while (*strings) /* loop over each string */
for (char *p = *strings++; *p; p++) /* loop over each char */
*p = tolower (*p); /* convert to lower */
}
void copy_to_lower (char **dest, char * const *src)
{
copy_strings (dest, src); /* just combine functions above into single */
to_lower (dest);
}
int main(void) {
char *strings[] = {"Hello", "Zerotom", "new", NULL };
size_t nelem = *(&strings + 1) - strings; /* number of pointers */
/* declare & allocate nelem pointers */
char **modstrings = malloc (nelem * sizeof *modstrings);
if (!modstrings) { /* validate EVERY allocation */
perror ("malloc-modstrings");
}
copy_to_lower (modstrings, strings); /* copy_to_lower to modstrings */
print_array (modstrings); /* print modstrings */
free_strings (modstrings); /* free strings (not pointers) */
copy_to_lower (modstrings, strings); /* ditto */
print_array (modstrings);
free_strings (modstrings);
copy_to_lower (modstrings, strings); /* ditto */
print_array (modstrings);
free_strings (modstrings);
copy_to_lower (modstrings, strings); /* ditto */
print_array (modstrings);
free_strings (modstrings);
free (modstrings); /* now free pointers */
}
Example Use/Output
$ ./bin/tolower_strings
hello, zerotom, new,
hello, zerotom, new,
hello, zerotom, new,
hello, zerotom, new,
Memory Use/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
It is imperative that you use a memory error checking program to insure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.
For Linux valgrind
is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.
$ valgrind ./bin/tolower_strings
==5182== Memcheck, a memory error detector
==5182== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==5182== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==5182== Command: ./bin/tolower_strings
==5182==
hello, zerotom, new,
hello, zerotom, new,
hello, zerotom, new,
hello, zerotom, new,
==5182==
==5182== HEAP SUMMARY:
==5182== in use at exit: 0 bytes in 0 blocks
==5182== total heap usage: 13 allocs, 13 frees, 104 bytes allocated
==5182==
==5182== All heap blocks were freed -- no leaks are possible
==5182==
==5182== For counts of detected and suppressed errors, rerun with: -v
==5182== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always confirm that you have freed all memory you have allocated and that there are no memory errors.
Now this is a long post, but you have made progress on learning dynamic allocation. It will take you a while to digest it all, but if you have further questions just let me know.
Upvotes: 1
Reputation: 41017
Does the to_lower function look ok, or how can it be modified so that it doesn't leak memory if it currently is?
As pointed out by @chux in comments, you need to add 1 to the len of orginal_string
, not to the pointer itself.
Regarding your question, yes, it leaks, each call to malloc
wants a call to free
. The problem here is: you are not allowed to call free
on the initial values since they are filled with string-literals.
A posible solution is:
extern char *strdup(const char *);
static char *dup(const char *str)
{
char *res = strdup(str);
if (res == NULL) {
perror("strdup");
exit(EXIT_FAILURE);
}
return res;
}
int main(void)
{
char *strings[] = {
dup("Hello"),
dup("Zerotom"),
dup("new"),
NULL
};
...
Now you can call to_lower
and you don't need to malloc
inside the function, just call free
for each element at the very end when the array is no longer needed.
Notice that strdup
is not part of the standard (but is avaliable on many implementations)
Upvotes: 1
Reputation: 186
Every time to_lower()
is called, you are replacing all string literals with dynamic memory pointers.
If you are calling to_lower() again without freeing existing memory, there is a memory leak.
lower_string = malloc(strlen(original_string + 1) * sizeof(char));
for (int j=0; j<=strlen(original_string); j++) {
lower_string[j] = tolower(original_string[j]);
}
strings[i]=lower_string;
So that when strings[]
array is no longer needed, you should free all its memory.
E.g.
for (int i=0; strings[i] != NULL; i++) {
free(strings[i]);
strings[i] = NULL;
}
Upvotes: 0