Sam Stern
Sam Stern

Reputation: 25134

Malloc keeps returning same pointer

I have some C code to parse a text file, first line by line and then into tokens

This is the function that parses it line by line:

int parseFile(char *filename) {
//Open file
FILE *file = fopen(filename, "r");
//Line, max is 200 chars
int pos = 0;
while (!feof(file)) {
    char *line = (char*) malloc(200*sizeof(char));
    //Get line
    line = fgets(line, 200, file);
    line = removeNewLine(line);
    //Parse line into instruction
    Instruction *instr = malloc(sizeof(instr));
    instr = parseInstruction(line, instr);
    //Print for clarification
    printf("%i: Instr is %s arg1 is %s arg2 is %s\n", 
        pos,
        instr->instr,
        instr->arg1,
        instr->arg2);
    //Add to end of instruction list
    addInstruction(instr, pos);
    pos++;
    //Free line
    free(line);
}
return 0;

}

And this is the function that parses each line into some tokens and eventually puts it into an Instruction struct:

Instruction *parseInstruction(char line[], Instruction *instr) {
//Parse instruction and 2 arguments
char *tok = (char*) malloc(sizeof(tok));
tok = strtok(line, " ");
printf("Line at %i tok at %i\n", (int) line, (int) tok);
instr->instr = tok;
tok = strtok(NULL, " ");
if (tok) {
    instr->arg1 = tok;
    tok = strtok(NULL, " ");
    if(tok) {
        instr->arg2 = tok;
    }
}
return instr;

}

the line printf("Line at %i tok at %i\n", (int) line, (int) tok); in ParseInstruction always prints the same two values, why are these pointer addresses never changing? I have confirmed that parseInstruction returns a unique pointer value each time, but each instruction has the same pointer in it's instr slot.

Just for clarity, Instruction is defined like this:

typedef struct Instruction {

char *instr;
char *arg1;
char *arg2;

} Instruction;

What am I doing wrong?

Upvotes: 0

Views: 936

Answers (4)

Matt Patenaude
Matt Patenaude

Reputation: 4877

There are a handful of problems that I can see with the code. The most egregious one is in parseInstruction:

char *tok = (char*) malloc(sizeof(tok));
tok = strtok(line, " ");

Here, you allocate memory for tok, but then you set tok equal to the result of strtok, which effectively makes your allocated memory unreachable (it's never used). Not only is this is a memory leak, but it means that tok == line will always be true, because strtok returns a pointer to the first token (which, when called with the first parameter != NULL, will generally be the start of the string).

That explains your first problem (that the values are the same). As to the problem that your values are always the same in repeated iterations, Edmund's answer sums it up well: completely by accident, you're free'ing and malloc'ing the same chunk of memory.

You should get rid of the char *tok = malloc(...) line, and simply let strtok operate on the string in place as intended. That will fix your memory leak. You'll still see the same values in your print out though, because that's how strtok works.

One serious problem that you will eventually run into with this code, though, is that you're assigning to the Instruction struct with pointers that point into the memory space of line. This works fine during each iteration of the loop, but when you free at the end of each iteration, that memory goes away, only to be reclaimed by someone else. It might work by happy accident now, but eventually you'll SEGFAULT all over the place. If you want each struct Instruction to have its own private memory, you'll want to do something more like this:

// You already do this part, but with a bug; use this code instead
Instruction *instr = (Instruction *)malloc(sizeof(Instruction));

// Then, inside parseInstruction, make these relevant changes
instr->instr = (char *)malloc(strlen(tok) + 1);
strcpy(instr->instr, tok);

instr->arg1 = (char *)malloc(strlen(tok) + 1);
strcpy(instr->arg1, tok);

instr->arg2 = (char *)malloc(strlen(tok) + 1);
strcpy(instr->arg2, tok);

The gist here is that you want to malloc a new chunk of memory for each field in your struct, of size strlen(...) + 1 (the "+1" is for the NUL byte), and then copy the string into that new memory.

When it comes time to free, make sure you free all of that memory:

// free each field
free(instr->instr);
free(instr->arg1);
free(instr->arg2);

// free the struct itself
free(instr);

There are a few other things that could be done to clean up your code (for instance, a lot of your assignments are unnecessary: fgets(line, 200, file) has the same effect as line = fgets(line, 200, file)), but this should get you on the right track.

Upvotes: 2

Edmund
Edmund

Reputation: 10819

You're allocating memory for a line, processing it, then freeing it. (You also allocate other memory for Instructions and don't free it, but that's separate.)

What's happening is when you free the memory from the line, there is a hole of size 200 characters (plus some bookkeeping space). The malloc you're using finds this hole and reuses it next time through the loop.

These are all separate allocations, but it just so happens the same space in memory is reused for each one. This should be treated as an accident, of course: a change in other allocations you make could change it.

Upvotes: 1

ruakh
ruakh

Reputation: 183371

That is how strtok works: it actually modifies the string that it's operating on, replacing separator-characters with '\0' and returning pointers into that string. (See the "BUGS" section in the strtok(3) manual page, though it's not really a bug, just a behavior-that-people-don't-usually-expect.) So your initial tok will always point to the first character of line.

By the way, this:

char *tok = (char*) malloc(sizeof(tok));
tok = strtok(line, " ");

first sets tok to point at the return-value of malloc, then re-assigns it to point at the return-value of strtok, thereby completely discarding the return-value of malloc. It's just like how writing this:

int i = some_function();
i = some_other_function();

completely discards the return-value of some_function(); except that it's even worse, because discarding the return-value of malloc results in a memory leak.

Upvotes: 4

Ravindra Bagale
Ravindra Bagale

Reputation: 17655

it must be sizeof(char) not sizeof(tok)

correct it

char *tok = (char*) malloc(sizeof(tok));
    to

char *tok = (char*) malloc(sizeof(char));

Upvotes: 1

Related Questions