David K.
David K.

Reputation: 477

C: strcat() terminates program without error

I'm writing C-Code on a windows machine. This is my first serious C program so I might not know a lot of the vocabulary.

I'm trying to write a program that reads the characters from a text file and puts them in a string.

# include <stdio.h>
# include <string.h>
# define MAXCHAR 10

char* load_html(char* filename) {
  FILE *file;

  file = fopen(filename, "r");
  if (file == NULL) {
    printf("File not found %s", filename);
    return NULL;
  }

  char str[MAXCHAR];
  char* html= "";

  while (fgets(str, MAXCHAR, file) != NULL) {
    printf(str);
    //strcat(html, str);
  }

  return html;
}

int main() {
  char* filename = "load_html.c";
  load_html(filename);

  return 0;
}

When I compile (gcc -o load_html.exe .\load_html.c) and run this piece of code it runs perfectly fine and prints the source code of this program to the console. However if I uncomment strcat

  while (fgets(str, MAXCHAR, file) != NULL) {
    printf(str);
    strcat(html, str);
  }

the program will read the first line of the file, pause for 1 to 2 seconds and then exit without an error.

What exactly is going on here? I feel like I'm missing something very important.

Upvotes: 1

Views: 511

Answers (3)

anastaciu
anastaciu

Reputation: 23832

html is a pointer to a string literal, these can't be changed (and are normaly stored in a read-only section of memory), which is what strcat tries to do, invoking undefined behavior in the process.

Even if that wasn't the case, html is clearly too small to take anything else as it only has space for 1 character.

It should be:

char html[SIZE] = "";

Where SIZE must be big enough to take all of the concatenated strings.

In this case you have a problem as you are returning html, if it's not a pointer it will be a local variable who's lifetime will expire as the function returns. You can work this out by:


  1. Keeping html as a pointer and allocating memory to it:
#include <stdlib.h>
//...
char *html = malloc(SIZE);

You then need to free(html) when you're done with it.


  1. Or passing a pointer to char as argument of the function:
void load_html(char* filename, char* html){ //void return type

    //remove declaration of html
    //do not return anything
}

And in main:

int main(){

    char* filename = "load_html.c";
    char html[size]; //buffer to store the concatenated string
    load_html(filename, html); //the string will be stored in the buffer you pass as an argument

}

I would prefer this second option as you don't need to allocate memory which is a more expensive method and forces you to manage the memory manually.


printf(str) is also ill-formed, this function needs the format specifier to print formatted output:

printf("%s" str);

Or use simply puts(str).

Upvotes: 2

Vlad from Moscow
Vlad from Moscow

Reputation: 311108

In this declaration

char* html= "";

you declared a pointer to the string literal "".

Then in this statement

strcat(html, str);

you are trying to change the pointed string literal.

However you may not change a string literal. According to the C Standard (6.4.5 String literals)

7 It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.

So if you want to accumulate strings read from a file you need to define an enough large character array.

For example

char html[MAXCHAR * MAXCHAR];
html[0] = '\0';

But in this case one more problem arises because you may not return a local array from a function that will not be alive after exiting the function.

So a more flexible and correct approach is to reallocate a character array dynamically in the while loop for each new string read from the file.

Something like

char *htmp = calloc( 1, sizeof( char ) );
size_t n = 1;

while (fgets(str, MAXCHAR, file) != NULL) {
    printf(str);

    n += strlen( str );

    char *tmp = realloc( html, n );

    if ( tmp == NULL ) break;

    html = tmp;

    strcat(html, str);
}

// ...

return html;

And in main you should free the allocated memory when the array is not required anymore.

free( html );

Upvotes: 2

Paul Hankin
Paul Hankin

Reputation: 58369

html is a pointer to 1 byte of memory. That memory is read-only because it's a statically allocated string. strcat overwrites it, so is undefined behavior for two reasons: writing memory out of bounds, and writing to memory that is read-only.

Any weird behavior you see (eg: your program exiting early) may be caused by that.

Upvotes: 0

Related Questions