user2877117
user2877117

Reputation:

What is the cause of my segment fault in C?

When I compile my code, I get no errors. However, when I attempt to run it, I get a segmentation fault (core dumped). Here is my main:

Original code

void main(int argc, char *argv[]){
    if(argc < 3){
          return;
    }

    char *stop_list_name = argv[1];
    char *doc_names[argc - 2];

    int i;
    for(i = 0; i < argc; i++){
            doc_names[i] = argv[i];
    }

//create the array of stop words
    char *stopWords[50];
    char *word;
    int word_counter = 0;
    FILE *fp;
    fp = fopen(stop_list_name, "r");
    if(fp != NULL){
            while(!feof(fp)){
                    fscanf(fp, "%s", word);
                    stopWords[word_counter] = word;
                    word_counter++;
            }
    }

    fclose(fp);

    for(i = 0; stopWords[i] != '\0'; i++){
            printf("%s", stopWords[i]);
    }
}

I'm pretty sure something is wrong in my while loop, but I don't exactly know what, or how to fix it.

Amended code

After seeing the answers, I modified my code so it looks like this, but it still crashes. What's wrong now?

int main(int argc, char *argv[]){
    if(argc < 3){
            return;
    }

    char *stop_list_name = argv[1];
    char *doc_names[argc - 2];

    int i;
    for(i = 2; i < argc; i++){
            doc_names[i-2] = argv[i];
    }

//create the array of stop words
    enum {MAX_STOP_WORDS = 50};
    char *stopWords[MAX_STOP_WORDS];
    int word_counter = 0;
    FILE *fp = fopen(stop_list_name, "r");
    if(fp != NULL){
            char word[64];
            int i;
            for(i = 0; i < MAX_STOP_WORDS && fscanf(fp, "%63s", word) == 1; i++){
                    stopWords[i] = strdup(word);
            }

            word_counter = i;
            fclose(fp);
    }

    for(i = 0; stopWords[i] != '\0'; i++){
            printf("%s", stopWords[i]);
    }
}

Upvotes: 0

Views: 89

Answers (2)

Jonathan Leffler
Jonathan Leffler

Reputation: 754450

Problems in the original code

One possible source of problems is:

char *doc_names[argc - 2];

int i;
for(i = 0; i < argc; i++){
        doc_names[i] = argv[i];
}

You allocate space for argc-2 pointers and proceed to copy argc pointers into that space. That's a buffer overflow (in this case, a stack overflow too). It can easily cause the trouble. A plausible fix is:

for (i = 2; i < argv; i++)
    doc_names[i-2] = argv[i];

However, you really don't need to copy the argument list; you can just process the arguments from index 2 to the end. I note that the code shown doesn't actually use doc_names, but the out-of-bounds assignment can still cause trouble.


You are not allocating space to read a word into, nor allocating new space for each stop word, nor do you ensure that you do not overflow the bounds of the array in which you're storing the words.

Consider using:

enum { MAX_STOP_WORDS = 50 };
char *stopWords[MAX_STOP_WORDS];
int word_counter = 0;
FILE *fp = fopen(stop_list_name, "r");
if (fp != NULL)
{
    char word[64];
    for (i = 0; i < MAX_STOP_WORDS && fscanf(fp, "%63s", word) == 1; i++)
        stopWords[i] = strdup(word);
    word_counter = i;
    fclose(fp);
}

This diagnosed problem is definitely a plausible cause of your crash. I used i (declared earlier in the code) in the loop because word_counter makes the loop control line too long for SO.

Strictly, strdup() is not a part of standard C, but it is a part of POSIX. If you don't have POSIX, you can write your own:

#include <stdlib.h>
#include <string.h>

char *strdup(const char *str)
{
    size_t len = strlen(str) + 1;
    char *result = malloc(len);
    if (result != 0)
        memmove(result, str, len);
    return result;
}

You also have some other bad practices on display:


Problems in the amended code

There's one important and a couple of very minor problems in the amended code:

  • Your loop that prints the stop words depends on a null pointer (curiously spelled as '\0' — it is a valid but unconventional spelling for a null pointer), but the initialization code doesn't set a null pointer.

    There are (at least) two options for fixing that:

    1. Add a null pointer:

         for (i = 0; i < MAX_STOP_WORDS-1 && fscanf(fp, "%63s", word) == 1; i++)
             stopWords[i] = strdup(word);
      
         stopWords[i] = 0;
         fclose(fp);
      }
      
      for (i = 0; stopWords[i] != '\0'; i++)
          printf("%s\n", stopWords[i]);
      

      Note that the upper bound is now MAX_STOP_WORDS - 1.

    2. Or you can use wordCount instead of a condition:

      for (i = 0; i < wordCount; i++)
          printf("%s\n", stopWords[i]);
      

    I'd choose the second option.

  • One reason for doing that is it avoids warnings about wordCount being set and not used — a minor problem.

  • And doc_names is also set but not used.

I worry about those because my default compiler options generate errors for unused variables — so the code doesn't compile until I fix it. That leads to:

#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[])
{
    if (argc < 3)
    {
        fprintf(stderr, "Usage: %s stop-words docfile ...\n", argv[0]);
        return 1;
    }

    char *stop_list_name = argv[1];
    char *doc_names[argc - 2];

    int i;
    for (i = 2; i < argc; i++)
    {
        doc_names[i - 2] = argv[i];
    }
    int doc_count = argc - 2;

    // create the array of stop words
    enum { MAX_STOP_WORDS = 50 };
    char *stopWords[MAX_STOP_WORDS];
    int word_counter = 0;
    FILE *fp = fopen(stop_list_name, "r");
    if (fp != NULL)
    {
        char word[64];
        int i;
        for (i = 0; i < MAX_STOP_WORDS && fscanf(fp, "%63s", word) == 1; i++)
            stopWords[i] = strdup(word);

        word_counter = i;
        fclose(fp);
    }

    for (i = 0; i < word_counter; i++)
        printf("stop word %d: %s\n", i, stopWords[i]);

    for (i = 0; i < doc_count; i++)
        printf("document %d: %s\n", i, doc_names[i]);

    return 0;
}

And, given a stop words file containing:

help
able
may
can
it
should
do
antonym
prozac

and compiling it (source file sw19.c, program sw19) with:

$ gcc -O3 -g -std=c11 -Wall -Wextra -Wmissing-prototypes -Wstrict-prototypes \
>     -Wold-style-definition -Werror sw19.c -o sw19

and running it as:

$ ./sw19 stopwords /dev/null
stop word 0: help
stop word 1: able
stop word 2: may
stop word 3: can
stop word 4: it
stop word 5: should
stop word 6: do
stop word 7: antonym
stop word 8: prozac
document 0: /dev/null
$

Upvotes: 3

Iharob Al Asimi
Iharob Al Asimi

Reputation: 53016

You are trying to store the scanned string to an uninitialized pointer,

fscanf(fp, "%s", word);

and word, is not even initialized.

You could use a static buffer for that, just like this

char word[100];

if (fscanf(fp, "%99s", word) != 1)
    word[0] = '\0'; /* ensure that `word' is nul terminated on input error */

Also, while (!feof(fp)) is wrong, because the EOF marker wont be set until fscanf() attempts to read past the end of the file, so the code will iterate one extra time. And in that case you would store the same word twice.

Note that you will also need to allocate space for the array of pointers, maybe there you could use malloc().

Upvotes: 1

Related Questions