Devan Buggay
Devan Buggay

Reputation: 2759

Reading a file character by character in C

I'm writing a BF interpreter in C and I've run into a problem reading files. I used to use scanf in order to read the first string, but then you couldn't have spaces or comments in your BF code.

Right now here is what I have.

char *readFile(char *fileName)
{
  FILE *file;
  char *code = malloc(1000 * sizeof(char));
  file = fopen(fileName, "r");
  do 
  {
    *code++ = (char)fgetc(file);

  } while(*code != EOF);
  return code;
}

I know the problem arises in how I'm assigning the next char in the file to the code pointer but I'm just not sure what that is.
My pointer knowledge is lacking which is the point of this exercise. The interpreter works fine, all using pointers, I'm just having a problem reading files in to it.

(I'm going to implement only reading +-><[]., into the file later, although if anyone has a good way to do it, it would be great if you'd let me know!)

Upvotes: 40

Views: 258811

Answers (7)

Chris Lutz
Chris Lutz

Reputation: 75389

The problem here is twofold

  • a) you increment the pointer before you check the value read in, and
  • b) you ignore the fact that fgetc() returns an int instead of a char.

The first is easily fixed:

char *orig = code; // the beginning of the array
// ...
do {
  *code = fgetc(file);
} while(*code++ != EOF);
*code = '\0'; // nul-terminate the string
return orig; // don't return a pointer to the end

The second problem is more subtle -fgetc returns an int so that the EOF value can be distinguished from any possible char value. Fixing this uses a temporary int for the EOF check and probably a regular while loop instead of do / while.

Upvotes: 1

dreamlax
dreamlax

Reputation: 95315

There are a number of things wrong with your code:

char *readFile(char *fileName)
{
    FILE *file;
    char *code = malloc(1000 * sizeof(char));
    file = fopen(fileName, "r");
    do 
    {
      *code++ = (char)fgetc(file);

    } while(*code != EOF);
    return code;
}
  1. What if the file is greater than 1,000 bytes?
  2. You are increasing code each time you read a character, and you return code back to the caller (even though it is no longer pointing at the first byte of the memory block as it was returned by malloc).
  3. You are casting the result of fgetc(file) to char. You need to check for EOF before casting the result to char.

It is important to maintain the original pointer returned by malloc so that you can free it later. If we disregard the file size, we can achieve this still with the following:

char *readFile(char *fileName)
{
    FILE *file = fopen(fileName, "r");
    char *code;
    size_t n = 0;
    int c;

    if (file == NULL)
        return NULL; //could not open file

    code = malloc(1000);

    while ((c = fgetc(file)) != EOF)
    {
        code[n++] = (char) c;
    }

    // don't forget to terminate with the null character
    code[n] = '\0';        

    return code;
}

There are various system calls that will give you the size of a file; a common one is stat.

Upvotes: 49

Justin
Justin

Reputation: 2142

Expanding upon the above code from @dreamlax

char *readFile(char *fileName) {
    FILE *file = fopen(fileName, "r");
    char *code;
    size_t n = 0;
    int c;

    if (file == NULL) return NULL; //could not open file
    fseek(file, 0, SEEK_END);
    long f_size = ftell(file);
    fseek(file, 0, SEEK_SET);
    code = malloc(f_size);

    while ((c = fgetc(file)) != EOF) {
        code[n++] = (char)c;
    }

    code[n] = '\0';        

    return code;
}

This gives you the length of the file, then proceeds to read it character by character.

Upvotes: 12

Mandrake
Mandrake

Reputation: 361

the file is being opened and not closed for each call to the function also

Upvotes: 3

caf
caf

Reputation: 239011

Here's one simple way to ignore everything but valid brainfuck characters:

#define BF_VALID "+-><[].,"

if (strchr(BF_VALID, c))
    code[n++] = c;

Upvotes: 3

Oliver Charlesworth
Oliver Charlesworth

Reputation: 272467

I think the most significant problem is that you're incrementing code as you read stuff in, and then returning the final value of code, i.e. you'll be returning a pointer to the end of the string. You probably want to make a copy of code before the loop, and return that instead.

Also, C strings need to be null-terminated. You need to make sure that you place a '\0' directly after the final character that you read in.

Note: You could just use fgets() to get the entire line in one hit.

Upvotes: 2

Prav
Prav

Reputation: 401

Either of the two should do the trick -

char *readFile(char *fileName)
{
  FILE *file;
  char *code = malloc(1000 * sizeof(char));
  char *p = code;
  file = fopen(fileName, "r");
  do 
  {
    *p++ = (char)fgetc(file);
  } while(*p != EOF);
  *p = '\0';
  return code;
}

char *readFile(char *fileName)
{
  FILE *file;
  int i = 0;
  char *code = malloc(1000 * sizeof(char));
  file = fopen(fileName, "r");
  do 
  {
    code[i++] = (char)fgetc(file);
  } while(code[i-1] != EOF);
  code[i] = '\0'
  return code;
}

Like the other posters have pointed out, you need to ensure that the file size does not exceed 1000 characters. Also, remember to free the memory when you're done using it.

Upvotes: 1

Related Questions