Reputation: 477
I'm writing C-Code on a windows machine. This is my first serious C program so I might not know a lot of the vocabulary.
I'm trying to write a program that reads the characters from a text file and puts them in a string.
# include <stdio.h>
# include <string.h>
# define MAXCHAR 10
char* load_html(char* filename) {
FILE *file;
file = fopen(filename, "r");
if (file == NULL) {
printf("File not found %s", filename);
return NULL;
}
char str[MAXCHAR];
char* html= "";
while (fgets(str, MAXCHAR, file) != NULL) {
printf(str);
//strcat(html, str);
}
return html;
}
int main() {
char* filename = "load_html.c";
load_html(filename);
return 0;
}
When I compile (gcc -o load_html.exe .\load_html.c
) and run this piece of code it runs perfectly fine and prints the source code of this program to the console. However if I uncomment strcat
while (fgets(str, MAXCHAR, file) != NULL) {
printf(str);
strcat(html, str);
}
the program will read the first line of the file, pause for 1 to 2 seconds and then exit without an error.
What exactly is going on here? I feel like I'm missing something very important.
Upvotes: 1
Views: 511
Reputation: 23832
html
is a pointer to a string literal, these can't be changed (and are normaly stored in a read-only section of memory), which is what strcat
tries to do, invoking undefined behavior in the process.
Even if that wasn't the case, html
is clearly too small to take anything else as it only has space for 1 character.
It should be:
char html[SIZE] = "";
Where SIZE
must be big enough to take all of the concatenated strings.
In this case you have a problem as you are returning html
, if it's not a pointer it will be a local variable who's lifetime will expire as the function returns. You can work this out by:
html
as a pointer and allocating memory to it:#include <stdlib.h>
//...
char *html = malloc(SIZE);
You then need to free(html)
when you're done with it.
void load_html(char* filename, char* html){ //void return type
//remove declaration of html
//do not return anything
}
And in main:
int main(){
char* filename = "load_html.c";
char html[size]; //buffer to store the concatenated string
load_html(filename, html); //the string will be stored in the buffer you pass as an argument
}
I would prefer this second option as you don't need to allocate memory which is a more expensive method and forces you to manage the memory manually.
printf(str)
is also ill-formed, this function needs the format specifier to print formatted output:
printf("%s" str);
Or use simply puts(str)
.
Upvotes: 2
Reputation: 311108
In this declaration
char* html= "";
you declared a pointer to the string literal ""
.
Then in this statement
strcat(html, str);
you are trying to change the pointed string literal.
However you may not change a string literal. According to the C Standard (6.4.5 String literals)
7 It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.
So if you want to accumulate strings read from a file you need to define an enough large character array.
For example
char html[MAXCHAR * MAXCHAR];
html[0] = '\0';
But in this case one more problem arises because you may not return a local array from a function that will not be alive after exiting the function.
So a more flexible and correct approach is to reallocate a character array dynamically in the while loop for each new string read from the file.
Something like
char *htmp = calloc( 1, sizeof( char ) );
size_t n = 1;
while (fgets(str, MAXCHAR, file) != NULL) {
printf(str);
n += strlen( str );
char *tmp = realloc( html, n );
if ( tmp == NULL ) break;
html = tmp;
strcat(html, str);
}
// ...
return html;
And in main you should free the allocated memory when the array is not required anymore.
free( html );
Upvotes: 2
Reputation: 58369
html
is a pointer to 1 byte of memory. That memory is read-only because it's a statically allocated string. strcat
overwrites it, so is undefined behavior for two reasons: writing memory out of bounds, and writing to memory that is read-only.
Any weird behavior you see (eg: your program exiting early) may be caused by that.
Upvotes: 0