Yuehai

Reputation: 1243

Getting strange strings in C after reading getc from file

I am getting strange strings after the first iteration. I suspect it could be because of string termination, but I am not sure how to fix it. Or I might be using malloc the wrong way.

I am happy for any hints.

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <ctype.h>
#include "file_reader.h"

/**
 * Opens a text file and reads the file. The text of the file is stored
 * in memory in blocks of size blockSize. The linked list with the text is
 * returned by the function. Each block should contain only complete words.
 * If a word is split by the end of the block, the last letters should be
 * moved into the next text block. Each text block must be NULL-terminated.
 * If the reading of the file fails, the program should return a meaningful
 * error message.
 */

int getFileSize(FILE* file) {
    FILE* endOfFile = file;
    fseek(endOfFile, 0, SEEK_END);
    long int size = ftell(file);
    fseek(file, 0, SEEK_SET);
    return (int) size;
}

LinkedList* read_text_file(const char* filename, int blockSize) {
    int globalByteCounter = 0;
    LinkedList*   list = LinkedList_create();
    int blockByteCounter;
    FILE* fp = fopen(filename, "r");
    int fileSize = getFileSize(fp);
    char* tokPointer = malloc(sizeof(getc(fp)));

    char* block = malloc(sizeof strcat("",""));

    //Loop for blocks in list
    while (globalByteCounter <= fileSize) {

        blockByteCounter = 0;
        char* word = malloc(sizeof(blockSize));

        //loop for each block
        while(blockByteCounter<blockSize) {
            char tok;

            //Building a word
            do {
                strcat(word, tokPointer);
                tok = (char) getc(fp);
                tokPointer=&tok;
                blockByteCounter++;
            }while (isalpha(tok));

            //Does this word still fit the block?
            if (blockByteCounter + strlen(word) < blockSize) {
                strcat(block, word);
                //Reset the word and append the special character
                word = strcpy(word,tokPointer);
            } else {
                strcpy(block,word);
            }
        }
        globalByteCounter += blockByteCounter;
        LinkedList_append(list, block);
        free(word);
    }
    LinkedList_append(list,block);
    fclose(fp);
    free(block);
    free(tokPointer);
    return list;
}

Upvotes: 0

Views: 247

Answers (1)

giusti

Reputation: 3538

There are multiple issues with the code. Let me tackle a few of them:

sizeof(getc(fp))

This is the same as applying sizeof to the return type of getc, so what you are actually writing is sizeof(int). sizeof does not evaluate its operand, so no character is even read from the file. That's not what you want.
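A minimal sketch of that point, assuming two hypothetical helpers (next_value and size_of_call) that are not part of the original code: the call inside sizeof never runs, and the result is just the size of an int.

```c
#include <assert.h>
#include <stdio.h>

int calls = 0;   /* counts real calls to next_value() (hypothetical helper) */

int next_value(void) {
    calls++;
    return 42;
}

/* Hypothetical demo: sizeof inspects only the type of its operand. */
size_t size_of_call(void) {
    size_t n = sizeof(next_value());  /* type is int; the call never runs */
    assert(calls == 0);               /* so the counter is untouched */
    return n;
}
```

The same reasoning applies to sizeof(getc(fp)): it is sizeof(int), and nothing is read from fp.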

Assuming you have a text file in which the size of what you want to read is stored as an ASCII number, what you are looking for is good old fscanf.
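If the file really does start with a size written as text, reading it could look roughly like this. The helper name read_block_size and the rule that a size must be positive are assumptions for illustration, not something the question states.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads a decimal size from the current position of fp.
   Returns 1 on success and stores the value in *out, otherwise returns 0. */
int read_block_size(FILE *fp, int *out) {
    assert(fp != NULL && out != NULL);
    int value;
    if (fscanf(fp, "%d", &value) != 1 || value <= 0)
        return 0;                 /* no number found, or a nonsensical size */
    *out = value;
    return 1;
}
```

Checking the return value of fscanf matters here: it reports how many items were actually converted.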

Similar here:

strcat("","")

but actually worse. strcat("a", "b") does not return "ab". It attempts to concatenate "b" onto "a" and returns the address of "a", which is bad: not only does it not do what you want, it also attempts to modify the string literal "a", and modifying a string literal is undefined behavior. Inside sizeof the call is never evaluated anyway, so all you get is sizeof(char *), the size of a pointer.
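One way to concatenate two strings safely is to allocate a writable buffer that is large enough for both and build the result there. This is only a sketch; concat is a hypothetical helper name, not a standard function.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated string holding a followed
   by b, or NULL if allocation fails. The caller must free the result. */
char *concat(const char *a, const char *b) {
    assert(a != NULL && b != NULL);
    size_t len = strlen(a) + strlen(b);
    char *out = malloc(len + 1);      /* +1 for the terminating '\0' */
    if (out == NULL)
        return NULL;
    strcpy(out, a);                   /* out is writable, unlike a literal */
    strcat(out, b);
    return out;
}
```

Note that the size comes from strlen at run time, not from sizeof.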

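blockByteCounter is declared without an initial value. It is assigned 0 at the top of the outer loop before it is read, so this one is harmless, but initializing it at the declaration is safer.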

And you got your hunch right:

char* word = malloc(sizeof(blockSize));

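First, malloc(sizeof(blockSize)) allocates sizeof(int) bytes, not blockSize bytes. Second, if you don't initialize word as an empty string, then when you try to concatenate tokPointer onto it, strcat will run through a non-terminated string. Not only that, but the memory tokPointer points to is not initialized either, and once tokPointer is set to &tok it points at a single char with no terminator, which is not a valid string for strcat.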
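A sketch of how the buffer could be allocated and filled instead, assuming two hypothetical helpers (new_empty_string and append_char) that do not appear in the original code:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocates room for capacity characters plus '\0'
   and makes the buffer a valid empty string. Returns NULL on failure. */
char *new_empty_string(size_t capacity) {
    char *s = malloc(capacity + 1);
    if (s != NULL)
        s[0] = '\0';                  /* now strlen(s) == 0 */
    return s;
}

/* Hypothetical helper: appends one character and keeps s terminated.
   s must have been created with at least the given capacity. */
void append_char(char *s, size_t capacity, char c) {
    size_t len = strlen(s);
    assert(len < capacity);           /* there must be room for c */
    s[len] = c;
    s[len + 1] = '\0';
}
```

With these, word would be created as new_empty_string(blockSize) and each character added with append_char(word, blockSize, tok), so strcat never sees an unterminated buffer.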

I'm also not sure why you are trying to use strcat to build a word. You don't need all those pointers. Once you know the required size of your buffer, you can 1) simply use fscanf to read one word; or 2) use fgetc with a good old simple counter i to put each letter into the buffer array, and then terminate it with '\0' before using it as a string.
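Option 2 could be sketched like this. The function name read_word, its signature, and the choice to skip non-letters between words are assumptions made for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: reads one alphabetic word from fp into buf, whose
   capacity cap includes the terminator. Leading non-letters are skipped.
   Returns the word length, or 0 when the end of the file is reached. */
size_t read_word(FILE *fp, char *buf, size_t cap) {
    assert(fp != NULL && buf != NULL && cap > 1);
    int c;                                /* int, so EOF stays distinguishable */
    size_t i = 0;

    while ((c = fgetc(fp)) != EOF && !isalpha(c))
        ;                                 /* skip separators */
    while (c != EOF && isalpha(c) && i + 1 < cap) {
        buf[i++] = (char)c;               /* store the letter, advance counter */
        c = fgetc(fp);
    }
    if (c != EOF)
        ungetc(c, fp);                    /* leave the next character unread */
    buf[i] = '\0';                        /* always terminate */
    return i;
}
```

Storing the result of fgetc in an int rather than a char is deliberate: casting to char first, as the original loop does, makes EOF indistinguishable from a real byte on some platforms. A word longer than cap - 1 letters is split across calls in this sketch.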

Upvotes: 1
