Reputation:
When I compile my code, I get no errors. However, when I attempt to run it, I get a segmentation fault (core dumped). Here is my main:
void main(int argc, char *argv[]){
if(argc < 3){
return;
}
char *stop_list_name = argv[1];
char *doc_names[argc - 2];
int i;
for(i = 0; i < argc; i++){
doc_names[i] = argv[i];
}
//create the array of stop words
char *stopWords[50];
char *word;
int word_counter = 0;
FILE *fp;
fp = fopen(stop_list_name, "r");
if(fp != NULL){
while(!feof(fp)){
fscanf(fp, "%s", word);
stopWords[word_counter] = word;
word_counter++;
}
}
fclose(fp);
for(i = 0; stopWords[i] != '\0'; i++){
printf("%s", stopWords[i]);
}
}
I'm pretty sure something is wrong in my while
loop, but I don't exactly know what, or how to fix it.
After seeing the answers, I modified my code so it looks like this, but it still crashes. What's wrong now?
int main(int argc, char *argv[]){
if(argc < 3){
return;
}
char *stop_list_name = argv[1];
char *doc_names[argc - 2];
int i;
for(i = 2; i < argc; i++){
doc_names[i-2] = argv[i];
}
//create the array of stop words
enum {MAX_STOP_WORDS = 50};
char *stopWords[MAX_STOP_WORDS];
int word_counter = 0;
FILE *fp = fopen(stop_list_name, "r");
if(fp != NULL){
char word[64];
int i;
for(i = 0; i < MAX_STOP_WORDS && fscanf(fp, "%63s", word) == 1; i++){
stopWords[i] = strdup(word);
}
word_counter = i;
fclose(fp);
}
for(i = 0; stopWords[i] != '\0'; i++){
printf("%s", stopWords[i]);
}
}
Upvotes: 0
Views: 89
Reputation: 754450
One possible source of problems is:
char *doc_names[argc - 2];
int i;
for(i = 0; i < argc; i++){
doc_names[i] = argv[i];
}
You allocate space for argc-2
pointers and proceed to copy argc
pointers into that space. That's a buffer overflow (in this case, a stack overflow too). It can easily cause the trouble. A plausible fix is:
for (i = 2; i < argv; i++)
doc_names[i-2] = argv[i];
However, you really don't need to copy the argument list; you can just process the arguments from index 2 to the end. I note that the code shown doesn't actually use doc_names
, but the out-of-bounds assignment can still cause trouble.
You are not allocating space to read a word into, nor allocating new space for each stop word, nor do you ensure that you do not overflow the bounds of the array in which you're storing the words.
Consider using:
enum { MAX_STOP_WORDS = 50 };
char *stopWords[MAX_STOP_WORDS];
int word_counter = 0;
FILE *fp = fopen(stop_list_name, "r");
if (fp != NULL)
{
char word[64];
for (i = 0; i < MAX_STOP_WORDS && fscanf(fp, "%63s", word) == 1; i++)
stopWords[i] = strdup(word);
word_counter = i;
fclose(fp);
}
This diagnosed problem is definitely a plausible cause of your crash. I used i
(declared earlier in the code) in the loop because word_counter
makes the loop control line too long for SO.
Strictly, strdup()
is not a part of standard C, but it is a part of POSIX. If you don't have POSIX, you can write your own:
#include <stdlib.h>
#include <string.h>
char *strdup(const char *str)
{
size_t len = strlen(str) + 1;
char *result = malloc(len);
if (result != 0)
memmove(result, str, len);
return result;
}
You also have some other bad practices on display:
while (!feof(file))
is always wrong.main()
return in C and C++?fclose(fp)
if the fopen()
worked, so you need to move the fclose()
inside the if
statement body.There's one important and a couple of very minor problems in the amended code:
Your loop that prints the stop words depends on a null pointer (curiously spelled as '\0'
— it is a valid but unconventional spelling for a null pointer), but the initialization code doesn't set a null pointer.
There are (at least) two options for fixing that:
Add a null pointer:
for (i = 0; i < MAX_STOP_WORDS-1 && fscanf(fp, "%63s", word) == 1; i++)
stopWords[i] = strdup(word);
stopWords[i] = 0;
fclose(fp);
}
for (i = 0; stopWords[i] != '\0'; i++)
printf("%s\n", stopWords[i]);
Note that the upper bound is now MAX_STOP_WORDS - 1
.
Or you can use wordCount
instead of a condition:
for (i = 0; i < wordCount; i++)
printf("%s\n", stopWords[i]);
I'd choose the second option.
One reason for doing that is it avoids warnings about wordCount
being set and not used — a minor problem.
And doc_names
is also set but not used.
I worry about those because my default compiler options generate errors for unused variables — so the code doesn't compile until I fix it. That leads to:
#include <stdio.h>
#include <string.h>
int main(int argc, char *argv[])
{
if (argc < 3)
{
fprintf(stderr, "Usage: %s stop-words docfile ...\n", argv[0]);
return 1;
}
char *stop_list_name = argv[1];
char *doc_names[argc - 2];
int i;
for (i = 2; i < argc; i++)
{
doc_names[i - 2] = argv[i];
}
int doc_count = argc - 2;
// create the array of stop words
enum { MAX_STOP_WORDS = 50 };
char *stopWords[MAX_STOP_WORDS];
int word_counter = 0;
FILE *fp = fopen(stop_list_name, "r");
if (fp != NULL)
{
char word[64];
int i;
for (i = 0; i < MAX_STOP_WORDS && fscanf(fp, "%63s", word) == 1; i++)
stopWords[i] = strdup(word);
word_counter = i;
fclose(fp);
}
for (i = 0; i < word_counter; i++)
printf("stop word %d: %s\n", i, stopWords[i]);
for (i = 0; i < doc_count; i++)
printf("document %d: %s\n", i, doc_names[i]);
return 0;
}
And, given a stop words file containing:
help
able
may
can
it
should
do
antonym
prozac
and compiling it (source file sw19.c
, program sw19
) with:
$ gcc -O3 -g -std=c11 -Wall -Wextra -Wmissing-prototypes -Wstrict-prototypes \
> -Wold-style-definition -Werror sw19.c -o sw19
and running it as:
$ ./sw19 stopwords /dev/null
stop word 0: help
stop word 1: able
stop word 2: may
stop word 3: can
stop word 4: it
stop word 5: should
stop word 6: do
stop word 7: antonym
stop word 8: prozac
document 0: /dev/null
$
Upvotes: 3
Reputation: 53016
You are trying to store the scanned string to an uninitialized pointer,
fscanf(fp, "%s", word);
and word
, is not even initialized.
You could use a static buffer for that, just like this
char word[100];
if (fscanf(fp, "%99s", word) != 1)
word[0] = '\0'; /* ensure that `word' is nul terminated on input error */
Also, while (!feof(fp))
is wrong, because the EOF
marker wont be set until fscanf()
attempts to read past the end of the file, so the code will iterate one extra time. And in that case you would store the same word
twice.
Note that you will also need to allocate space for the array of pointers, maybe there you could use malloc()
.
Upvotes: 1