Requesting clarification on file position in C

Question

I am currently working through an exercise in K. N. King's C Programming: A Modern Approach. This is the exercise:

(Question start)

Of the many techniques for compressing the contents of a file, one of the simplest and fastest is known as run-length encoding. This technique compresses a file by replacing sequences of identical bytes by a pair of bytes: a repetition count followed by a byte to be repeated. For example, suppose that the file to be compressed begins with the following sequence of bytes (shown in hexadecimal):

46 6F 6F 20 62 61 72 21 21 21 20 20 20 20 20

The compressed file will contain the following bytes:

01 46 02 6F 01 20 01 62 01 61 01 72 03 21 05 20

Run-length encoding works well if the original file contains many long sequences of identical bytes. In the worst case (a file with no repeated bytes), run-length encoding can actually double the length of the file. (Question end)
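As a sanity check on the example above, here is a minimal in-memory sketch of the encoding. The function name rle_encode and the choice to cap a run at 255 (so that the count fits in one byte) are assumptions for illustration, not part of the exercise text.

```c
#include <stddef.h>

/* Hypothetical helper: run-length encode `n` bytes from `in` into `out`.
 * `out` must have room for 2 * n bytes (the worst case).
 * Returns the number of bytes written. Runs are capped at 255 so the
 * count always fits in a single byte (an assumption of this sketch). */
size_t rle_encode(const unsigned char *in, size_t n, unsigned char *out)
{
    size_t i = 0, o = 0;

    while (i < n) {
        unsigned char value = in[i];
        size_t run = 1;

        /* extend the run while the following byte matches */
        while (i + run < n && in[i + run] == value && run < 255)
            run++;

        out[o++] = (unsigned char) run;  /* repetition count */
        out[o++] = value;                /* byte to be repeated */
        i += run;
    }
    return o;
}
```

Fed the 15 example bytes, this should produce the 16 bytes listed in the exercise.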

I have a question about my code for this exercise, specifically the part below (and one particular line in it):

int main(int argc, char *argv[])
{

    FILE *fp,*fpout;
    char *outfile;
    unsigned char value,next,count;
    long int position;
    if (argc !=2)
    {
        printf("Error: Incorrect usage of program. Usage: c22p7.exe file\n");
        exit(EXIT_FAILURE);
    }
    if ( (fp=fopen(argv[1], "rb")) == NULL)
    {
        printf("Error: Unable to open file\n");
        exit(EXIT_FAILURE);
    }
    outfile = malloc(strlen(argv[1]) + 5);
    strcpy(outfile,argv[1]);
    strcat(outfile,".RLE");

    if ( (fpout=fopen(outfile,"wb")) == NULL)
    {
        printf("Error: Unable to open file\n");
        exit(EXIT_FAILURE);
    }
    free(outfile);
    while ( fread(&value,sizeof(unsigned char),1,fp) > 0)
    {
        count = 1;
        position = ftell(fp);
        while ( fread(&next,sizeof(unsigned char),1,fp) > 0 && next == value)
        {
                count ++;
        }

        fwrite(&count,sizeof(unsigned char),1,fpout);
        fwrite(&value,sizeof(unsigned char),1,fpout);
        fseek(fp,-1L,SEEK_CUR);  /* THIS PARTICULAR LINE */
    }

    fclose(fp);
    fclose(fpout);

    exit(EXIT_SUCCESS);
}

With regard to fseek(fp,-1L,SEEK_CUR);, my rationale was that the program keeps reading bytes until it has read the first byte that is different. It then moves back by one byte position, hence the "-1L", so that the next iteration of the loop reads that byte again. E.g.

01 01 01 01 02 02

It reads all the 01 until it reads the first 02, then fseek() moves the file position back by 1 byte so on the next iteration of the loop it will read the first 02 again. However, if I implement the code this way it doesn't work.

fseek(fpin, position + (amount - 1), SEEK_SET);

^ This works, however. Here position is the file position after reading the first byte, and amount is the number of bytes already read. I understand how this particular line of code works, but I do not understand why my SEEK_CUR method doesn't work. Thank you all for the help.

Craig Estey · Accepted Answer

You do not need to move the file position backwards. Doing so will actually produce wrong results for the RLE.
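One likely way the results go wrong, sketched under my reading of the question's loop: when the inner read fails at end-of-file, it consumes nothing, so seeking back one byte lands before the last byte of the file, and that byte is read again on the next pass. The helper name byte_after_eof_seek and the use of tmpfile() are assumptions made only for this demonstration.

```c
#include <stdio.h>

/* Hypothetical demo: write `len` bytes to a temporary file, read them all,
 * hit end-of-file, then seek back one byte the way the question's loop does.
 * Returns the byte that the next read produces, or EOF on any failure. */
int byte_after_eof_seek(const unsigned char *data, size_t len)
{
    FILE *fp = tmpfile();
    int c;

    if (fp == NULL)
        return EOF;
    if (fwrite(data, 1, len, fp) != len) {
        fclose(fp);
        return EOF;
    }
    rewind(fp);

    /* consume every byte; the loop ends on the failed read at end-of-file */
    while ((c = fgetc(fp)) != EOF)
        ;

    /* the failed read consumed nothing, so this lands before the LAST byte */
    if (fseek(fp, -1L, SEEK_CUR) != 0) {
        fclose(fp);
        return EOF;
    }

    c = fgetc(fp);  /* the last byte comes back a second time */
    fclose(fp);
    return c;
}
```

With the two bytes "AB" as input, the call returns 'B' again after end-of-file was already reached. In the question's loop that would mean the final pair keeps being written, so the loop never terminates.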

If you must, there is an easier way. Use ungetc(chr,fp);
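A minimal sketch of what that could look like, assuming a helper named rle_with_ungetc (a hypothetical name) that takes already-open streams. The byte that ended the run is pushed back only if a real byte was read, never at end-of-file, and a run is cut at 255 so the count fits in one byte.

```c
#include <stdio.h>

/* Hypothetical helper: run-length encode `in` to `out`, using ungetc()
 * instead of fseek() to return the first byte of the next run. */
void rle_with_ungetc(FILE *in, FILE *out)
{
    int value;
    int next = EOF;

    while ((value = fgetc(in)) != EOF) {
        int count = 1;

        /* read ahead while bytes match; stop at 255 so the count fits a byte */
        while (count < 255 && (next = fgetc(in)) == value)
            count++;

        /* push back only a real byte that ended the run, never EOF;
         * when the run was cut at 255, nothing extra was read */
        if (count < 255 && next != EOF)
            ungetc(next, in);

        fputc(count, out);
        fputc(value, out);
    }
}
```

The standard guarantees only one byte of pushback, which is all this loop ever needs.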

But, you can do this with a simpler loop. And, if you're going to get input char-at-a-time, use fgetc instead of fread.

You really only need a loop that does a single fgetc at the top of the loop.

You already have the basics: A variable that remembers the current character and another that remembers the previous one [which is the RLE char].

When the character changes, you just have to dump the accumulated RLE pair, then set the new character as the RLE char with a starting count of one. Setting the count to one (because the new character has already been read) is the crucial reason why you don't have to backtrack.

Here's a version that works [tested with the specified data]:

#include <stdio.h>

void
rleout(int rlechr,int rlecnt)
{

    if (rlecnt > 0) {
        fputc(rlecnt,stdout);
        fputc(rlechr,stdout);
    }
}

int
main(void)
{
    int curchr;
    int rlechr = -1;
    int rlecnt = 0;

    while (1) {
        // get next character
        curchr = fgetc(stdin);

        // hit EOF
        if (curchr == EOF)
            break;

        // starting a new RLE
        if (curchr != rlechr) {
            // output the current RLE pair
            rleout(rlechr,rlecnt);

            // set new RLE pair
            rlechr = curchr;
            rlecnt = 1;
            continue;
        }

        // advance number of consecutive chars
        ++rlecnt;

        // if we're using the max space in a byte, we _have_ to dump the RLE
        // pair
        if (rlecnt >= 255) {
            rleout(rlechr,rlecnt);
            rlecnt = 0;
        }
    }

    // output final pair [if any]
    rleout(rlechr,rlecnt);

    return 0;
}
