Reputation: 289
I am trying to write a function to convert a text file into a CSV file. The input file has 3 lines with space-delimited entries. I have to find a way to read a line into a string and transform the three lines from the input file to three columns in a CSV file.
The files look like this :
Jake Ali Maria
24 23 43
Montreal Johannesburg Sydney
And I have to transform it into something like this:
Jake, 24, Montreal
...etc
I figured I could create a char **line
variable that would hold three references to three separate char
arrays, one for each of the three lines of the input file. I.e., my goal is to have *(line+i)
store the i+1
'th line of the file.
I wanted to avoid hardcoding char
array sizes, such as
char line1 [999];
fgets(line1, 999, file);
so I wrote a while loop to fgets
pieces of a line into a small buffer array of predetermined size, and then strcat
and realloc
memory as necessary to store the line as a string, with *(line+i)
as as pointer to the string, where i
is 0
for the first line, 1
for the second, etc.
Here is the problematic code:
#include <stdio.h>
#include<stdlib.h>
#include<string.h>
#define CHUNK 10
char** getLines (const char * filename){
FILE *file = fopen(filename, "rt");
char **lines = (char ** ) calloc(3, sizeof(char*));
char buffer[CHUNK];
for(int i = 0; i < 3; i++){
int lineLength = 0;
int bufferLength = 0;
*(lines+i) = NULL;
do{
fgets(buffer, CHUNK, file);
buffLength = strlen(buffer);
lineLength += buffLength;
*(lines+i) = (char*) realloc(*(lines+i), (lineLength +1)*sizeof(char));
strcat(*(lines+i), buffer);
}while(bufferLength ==CHUNK-1);
}
puts(*(lines+0));
puts(*(lines+1));
puts(*(lines+2));
fclose(file);
}
void load_and_convert(const char* filename){
char ** lines = getLines(filename);
}
int main(){
const char* filename = "demo.txt";
load_and_convert(filename);
}
This works as expected only for i=0
. However, going through this with GDB, I see that I get a realloc(): invalid pointer
error. The buffer loads fine, and it only crashes when I call 'realloc' in the for loop for i=1
, when I get to the second line.
I managed to store the strings like I wanted in a small example I did to try to see what was going on, but the inputs were all on the same line. Maybe this has to do with fgets
reading from a new line?
I would really appreciate some help with this, I've been stuck all day.
Thanks a lot!
***edit
I tried as suggested to use calloc
instead of malloc
to initialize the variable **lines
, but I still have the same issue.I have added the modifications to the original code I uploaded.
***edit
After deleting the file and recompiling, the above now seems to work. Thank you to everyone for helping me out!
Upvotes: 1
Views: 3239
Reputation: 18420
Here, you have to zero your array of pointers (for example by using calloc()
),
char **line = (char**)malloc(sizeof(char*)*3); //allocate space for three char* pointers
otherwise the reallocs
*(line+i) = (char *)realloc(*(line+i), (inputLength+1)*sizeof(char)); //+1 for the empty character
use an uninitialized pointer, leading to undefined behaviour.
That it works with i=0
is pure coindicence and is a typical pitfall when encountering UB.
Furthermore, when using strcat()
, you have to make sure that the first parameter is already a zero-terminated string! This is not the case here, since at the first iteration, realloc(NULL, ...);
leaves you with an uninitialized buffer. This can lead to strcpy()
writing past the end of your allocated buffer and lead to heap corruption. A possible fix is to use strcpy()
instead of strcat()
(this should even be more efficient here):
do{
fgets(buffer, CHUNK, file);
buffLength = strlen(buffer);
lines[i] = realloc(lines[i], (lineLength + buffLength + 1));
strcpy(lines[i]+lineLength, buffer);
lineLength += buffLength;
}while(bufferLength ==CHUNK-1);
The check bufferLength == CHUNK-1
will not do what you want if the line (including the newline) is exactly CHUNK-1
bytes long. A better check might be while (buffer[buffLength-1] != '\n')
.
Btw. line[i]
is by far better readable than *(line+i)
(which is semantically identical).
Upvotes: 1
Reputation: 90175
You allocate line
(which is a misnomer since it's not a single line), which is a pointer to three char*
s. You never initialize the contents of line
(that is, you never make any of those three char*
s point anywhere). Consequently, when you do realloc(*(line + i), ...)
, the first argument is uninitialized garbage.
To use realloc
to do an initial memory allocation, its first argument must be a null pointer. You should explicitly initialize each element of line
to NULL
first.
Additionally, *(line+i) = (char *)realloc(*(line+i), ...)
is still bad because if realloc
fails to allocate memory, it will return a null pointer, clobber *(line + i)
, and leak the old pointer. You instead should split it into separate steps:
char* p = realloc(line[i], ...);
if (p == null) {
// Handle failure somehow.
exit(1);
}
line[i] = p;
A few more notes:
malloc
/realloc
/calloc
. It's not necessary since C allows implicit conversion from void*
to other pointer types, and the explicit could mask an error where you accidentally omit #include <stdlib.h>
.sizeof(char)
is, by definition, 1 byte.T* p = malloc(n * sizeof *p);
instead of T* p = malloc(n * sizeof (T));
. That way if the type of p
ever changes, you won't silently be allocating the wrong amount of memory if you neglect to update the malloc
(or realloc
or calloc
) call.Upvotes: 2