Reputation: 1653
I have to do a rle algorithm in c with the escape character (Q)
example if i have an input like: AAAAAAABBBCCCDDDDDDEFG
the output have to be: QA7BBBCCCQD6FFG
this is the code that i made:
#include <stdio.h>
#include <stdlib.h>
void main()
{
FILE *source = fopen("Test.txt", "r");
FILE *destination = fopen("Dest.txt", "w");
char carCorrente; //in english: currentChar
char carSucc; // in english: nextChar
int count = 1;
while(fread(&carCorrente, sizeof(char),1, source) != 0) {
if (fread(&carCorrente, sizeof(char),1, source) == 0){
if(count<=3){
for(int i=0;i<count;i++){
fprintf(destination,"%c",carCorrente);
}
}
else {
fwrite("Q",sizeof(char),1,destination);
fprintf(destination,"%c",carCorrente);
fprintf(destination,"%d",count);
}
break;
}
else fseek(source,-1*sizeof(char), SEEK_CUR);
while (fread(&carSucc, sizeof(char), 1, source) != 0) {
if (carCorrente == carSucc) {
count++;
}
else {
if(count<=3){
for(int i=0;i<count;i++){
fprintf(destination,"%c",carCorrente);
}
}
else {
fwrite("Q",sizeof(char),1,destination);
fprintf(destination,"%c",carCorrente);
fprintf(destination,"%d",count);
}
count = 1;
goto OUT;
}
}
OUT:fseek(source,-1*sizeof(char), SEEK_CUR); //exit 2° while
}
}
the problem is when i have an input like this: ABBBCCCDDDDDEFGD
in this case the output is: QB4CCCQD5FFDD
and i don't know why :(
Upvotes: 0
Views: 8744
Reputation: 1653
Thank you so much, i fixed my algorithm. The problem was a variable, in the first if after the while. Before
if (fread(&carCorrente, sizeof(char),1, source) == 0)
now
if (fread(&carSucc, sizeof(char),1, source) == 0){
for sure all my algorithm is wild. I mean it is too much slow!
i made a test with my version and with the version of Vikram Bhat and i saw how much my algorithm losts time.
For sure with getc() i can save more time.
now i'm thinking about the encoding (decompression) and i can see a little problem.
example:
if i have an input like: QA7QQBQ33TQQ10QQQ
how can i recognize which is the escape character ???
thanks
Upvotes: 0
Reputation: 42149
I think the major problem in your approach is that it's way too complicated with multiple different places where you read input and seek around in the input. RLE can be done in one pass, there should not be a need to seek to the previous characters. One way to solve this is to change the logic into looking at the previous characters and how many times they have been repeated, instead of trying to look ahead at future characters. For instance:
int repeatCount = 0;
int previousChar = EOF;
int currentChar; // type changed to 'int' for fgetc input
while ((currentChar = fgetc(source)) != EOF) {
if (currentChar != previousChar) {
// print out the previous run of repeated characters
outputRLE(previousChar, repeatCount, destination);
// start a new run with the current character
previousChar = currentChar;
repeatCount = 1;
} else {
// same character repeated
++repeatCount;
}
}
// output the final run of characters at end of input
outputRLE(previousChar, repeatCount, destination);
Then you can just implement outputRLE
to do the output to print out a run of the character c
repeated count
times (note that count
can be 0); here's the function declaration:
void outputRLE(const int c, const int count, FILE * const destination)
You can do it pretty much the same way as in your current code, although it can be simplified greatly by combining the fwrite
and two fprintf
s to a single fprintf
. Also, you might want to think what happens if the escape character 'Q'
appears in the input, or if there is a run of 10 or more repeated characters. Deal with those cases in outputRLE
.
An unrelated problem in your code is that the return type of main
should be int
, not void
.
Upvotes: 1
Reputation: 6246
There is no need to use Fseek to rewind as u have done , Here is a code that is have written without using it by using simple counter & current sequence character.
C implementation:
#include<stdio.h>
#include<stdlib.h>
void main()
{
FILE *source = fopen("Test.txt", "r");
FILE *destination = fopen("Dest.txt", "w");
char currentChar;
char seqChar;
int count = 0;
while(1) {
int flag = (fread(¤tChar, sizeof(char),1, source) == 0);
if(flag||seqChar!=currentChar) {
if(count>3) {
char ch = 'Q';
int k = count;
char str[100];
int digits = sprintf(str,"%d",count);
fwrite(&ch,sizeof(ch),1,destination);
fwrite(&seqChar,sizeof(ch),1,destination);
fwrite(&str,sizeof(char)*digits,1,destination);
}
else {
for(int i=0;i<count;i++)
fwrite(&seqChar,sizeof(char),1,destination);
}
seqChar = currentChar;
count =1;
}
else count++;
if(flag)
break;
}
fclose(source);
fclose(destination);
}
Upvotes: 2
Reputation: 29126
Your code has various problems. First, I'm not sure whether you should read straight from the file. In your case, it might be better to read the source string to a text buffer first with fgets
and then do the encoding. (I think in your assignment, you should only encode letters. If source
is a regular text file, it will have at least one newline.)
But let's assume that you need to read straight from the disk: You don't have to go backwards. You already habe two variables for the current and the next char. Read the next char from disk once. Before reading further "next chars", assign the :
int carSucc, carCorr; // should be ints for getc
carSucc = getc(source); // read next character once before loop
while (carSucc != EOF) { // test for end of input stream
int carCorr = next; // this turn's char is last turn's "next"
carSucc = getc(source);
// ... encode ...
}
The going forward and backward makes the loop complicated. Besides, what happens if the second read read zero characters, i.e. has reached the end of the file? Then you backtrace once and go into the second loop. That doesn't look as if it was intended.
Try to go only forward, and use the loop above as base for your encoding.
Upvotes: 1