Reputation: 3
I'm trying to write a program that converts an HTML file into a text file.
#include <stdio.h>
#include <stdlib.h>
#define BUFLEN 2048
int main(){
    FILE *fp;
    fp = fopen("tc.txt", "r");
    int i = 0;
    int j = 0;
    char storage[BUFLEN];
    char title[100];
    fread(storage, 1, sizeof(storage), fp);
    for(i=0; storage[i]; i++){
        if(storage[i] == '<' && storage [i+1] == 't'){
            for(i=i+7; storage[i] != '<'; j++){
                title[j] = storage[i];
            }
        }
    }
    puts(title);
    fclose(fp);
    return 0;
}
Basically, I'm trying to find the <title> tag in an HTML file (saved as a .txt file) and then copy whatever follows it until the program reaches '<', which marks the start of the closing </title> tag.
However, when I run the program, a segmentation fault occurs.
Upvotes: 0
Views: 60
Reputation: 2472
In this line:
if(storage[i] = '<' && storage [i+1] = 't'){
you are assigning '<' to storage[i] instead of comparing. Change it to if(storage[i] == '<' && storage [i+1] == 't'){ to check for equality.
Also, in for(i=0; storage[i]; i++) it's better to iterate while i is less than (<) the number of bytes read; fread() returns the number of elements successfully read. If the contents of the file are not read, you are checking whatever values happen to be in the array, which is not initialised. You should use memset() to initialize the array to 0.
As it currently stands, if the [i+1] index goes past BUFLEN, you'll be reading outside the memory allocated for the array. You can also go past the number of bytes read and check something that wasn't read and/or initialised.
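A minimal sketch of the reading step described above, assuming it is acceptable to keep only the first part of a large file. The helper name read_prefix and its parameters are hypothetical, not part of the question's code; the point is only that the buffer is zeroed with memset() and the count returned by fread() is kept.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads up to cap - 1 bytes from path into buf,
   zero-terminates the result, and returns the number of bytes read
   (0 if the file cannot be opened or cap is 0). */
size_t read_prefix(const char *path, char *buf, size_t cap)
{
    if (cap == 0)
        return 0;
    memset(buf, 0, cap);                    /* no indeterminate bytes remain */

    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return 0;                           /* caller sees "nothing read" */

    size_t n = fread(buf, 1, cap - 1, fp);  /* leave room for the terminator */
    fclose(fp);
    buf[n] = '\0';
    return n;
}
```

With the count n in hand, the outer scan could run as for (i = 0; i + 1 < n; i++), so that storage[i+1] always refers to a byte that was actually read.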
Upvotes: 2
Reputation: 777
I suspect your segmentation fault is coming from the inner for loop:
for( i=i+7; storage[i] != '<'; j++ ){
title[j] = storage[i];
}
You aren't updating i in this loop; if you wanted to, I believe it would need to be incremented right next to the j++. As written, storage[i] never changes, so the loop keeps running and incrementing j until it gets to the end of the title array. Then it tries to access memory the title array doesn't have, and there comes your segmentation fault.
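One way the copy loop might look with both indices advancing is sketched below. The helper copy_until and its parameters are hypothetical names introduced for illustration; it also stops at the end of the data and at the end of the destination, and it zero-terminates the result, which puts() requires.

```c
#include <stddef.h>

/* Hypothetical helper: copies src[start..] into dst until `stop` is seen,
   the end of the data (n bytes) is reached, or dst is full.
   Zero-terminates dst and returns the number of characters copied. */
size_t copy_until(const char *src, size_t n, size_t start,
                  char stop, char *dst, size_t cap)
{
    size_t i = start;
    size_t j = 0;

    if (cap == 0)
        return 0;

    while (i < n && src[i] != stop && j + 1 < cap) {
        dst[j++] = src[i++];    /* both indices advance together */
    }
    dst[j] = '\0';              /* puts() needs a terminated string */
    return j;
}
```

In the question's program this would be called with storage, the byte count from fread(), the index just past <title>, '<', title, and sizeof(title).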
Upvotes: 2
Reputation: 409216
You have undefined behavior in your code, more specifically in the condition storage[i] in your loop.
Local variables, including arrays, are not initialized. Their values are indeterminate, and using those indeterminate values leads to undefined behavior. Using the condition storage[i] in the loop might even cause you to iterate out of bounds of the array.
You need to use the return value of fread to know how many characters were read, and use that as an upper limit in the loop. And then you have to think about what happens with storage [i+1] when i is already at the upper limit.
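As a rough sketch of a scan that respects that upper limit, the function below compares the whole tag with memcmp() and only at positions where the tag still fits inside the n bytes read, so no index ever reaches past the data. The name find_after_tag is hypothetical, and matching the full "<title>" string rather than just '<' and 't' is a substitution for the question's two-character check, not the original approach.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the index just past the first occurrence
   of tag in buf[0..n), or (size_t)-1 if the tag is not found. */
size_t find_after_tag(const char *buf, size_t n, const char *tag)
{
    size_t len = strlen(tag);

    if (len == 0 || len > n)
        return (size_t)-1;

    /* i + len <= n guarantees every compared byte was actually read */
    for (size_t i = 0; i + len <= n; i++) {
        if (memcmp(buf + i, tag, len) == 0)
            return i + len;
    }
    return (size_t)-1;
}
```

A caller would pass the count returned by fread() as n and check the result against (size_t)-1 before copying the title text.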
Upvotes: 1