dav
dav

Reputation: 936

strcat() causes segmentation fault only after program is finished?

Here's a program that summarizes text. Up to this point, I'm counting the number of occurrences of each word in the text. But, I'm getting a segmentation fault in strcat.

Program received signal SIGSEGV, Segmentation fault.
0x75985629 in strcat () from C:\WINDOWS\SysWOW64\msvcrt.dll

However, while stepping through the code, the program runs the strcat() function as expected. I don't receive the error until line 75, when the program ends.

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <ctype.h>

#define MAXTEXT 1000
#define MAXLINE 200
#define MAXWORDS 200
#define MAXWORD 32

char *strtolower(char *d, const char *s, size_t len);

/* summarizer: summarizes text */
int main(int argc, char *argv[]) {
    /* argument check */
    char *prog = argv[0];
    if (argc < 1) {
        fprintf(stderr, "%s: missing arguments, expected 1", prog);
        exit(1);
    }
    /* attempt to open file */
    FILE *fp;
    if (!(fp = fopen(argv[1], "r"))) {
        fprintf(stderr, "%s: Couldn't open file %s", prog, argv[1]);
        exit(2);
    }
    /* read file line by line */
    char line[MAXLINE], text[MAXTEXT];
    while ((fgets(line, MAXTEXT, fp))) {
        strncat(text, line, MAXLINE); 
    }
    /* separate into words and count occurrences */
    struct wordcount {
        char *word;
        int count;
    };
    struct wordcount words[MAXWORDS];
    char *token, *delim = " \t\n.,!?";
    char word[MAXWORD], textcpy[strlen(text)+1]; /*len of text and \0 */
    int i, j, is_unique_word = 1;
    strcpy(textcpy, text);
    token = strtok(textcpy, delim);
    for (i = 0; i < MAXWORDS && token; i++) {
        strtolower(word, token, strlen(token));
        token = strtok(NULL, delim);
        /* check if word exists */
        for (j = 0; words[j].word && j < MAXWORDS; j++) {
            if (!strcmp(word, words[j].word)) {
                is_unique_word = 0;
                words[j].count++;
            }
        }
        /* add to word list of unique */
        if (is_unique_word) {
            strcpy(words[i].word, word);
            words[i].count++;
        }
        is_unique_word = 1;
    }

    return 0;
}

/* strtolower: copy str s to dest d, returns pointer of d */
char *strtolower(char *d, const char *s, size_t len) {
    int i;

    for (i = 0; i < len; i++) {
        d[i] = tolower(*(s+i));
    }

    return d;
}

Upvotes: 1

Views: 257

Answers (3)

chqrlie
chqrlie

Reputation: 144780

The problem is in the loop: while ((fgets(line, MAXTEXT, fp))) strncat(text, line, MAXLINE);. It is incorrect for multiple reasons:

  • text is uninitialized, concatenating a string to it has undefined behavior. Undefined behavior may indeed cause a crash after the end of the function, for example if the return address was overwritten.
  • there is no reason to use strncat() with a length of MAXLINE, the string read by fgets() has at most MAXLINE-1 bytes.
  • you do not check if there is enough space at the end of text to concatenate the contents of line. strncat(dest, src, n) copies at most n bytes from src to the end of dest and always sets a null terminator. It is not a safe version of strcat(). If the line does not fit at the end of text, you have unexpected behavior, and again you can observe a crash after the end of the function, for example if the return address was overwritten.

You could just try and read the whole file with fread:

/* read the file into the text array */
char text[MAXTEXT];
size_t text_len = fread(text, 1, sizeof(text) - 1, fp);
text[text_len] = '\0';

If text_len == sizeof(text) - 1, the file is potentially too large for the text array and the while loop would have caused a buffer overflow.

Upvotes: 1

haccks
haccks

Reputation: 106022

Destination string of strncat function shall be null terminated. You need to null terminate text before passing it to strncat function. You also have to write only upto MAXLINE-1 bytes and leave a space for '\0' appended by strncat at the end to stop buffer overflow.

char line[MAXLINE], text[MAXTEXT] = {'\0'};  
while ((fgets(line, MAXTEXT, fp)))
{
    strncat(text, line, MAXLINE-1); 
}

Upvotes: 1

bartoli
bartoli

Reputation: 173

There is at least one problem because you create line with MAXLINE size (200), then you fgets() up to MAXTEXT (1000) chars into it.

Upvotes: 1

Related Questions