Filip Jerga

Reputation: 75

Reading HTML text using C++

I am currently reading a book by Alex Allain, and it contains this practice problem: [image of the problem statement]

I don't know how to approach this problem and I am stuck. Should I first find every tag and save it to an array or vector, then compare the tags in the vector with the original string and add some conditions? I am not looking for code from you; I want to solve it myself. I am just looking for inspiration, ideas, or useful methods I could use. Thank you.

Upvotes: 1

Views: 2065

Answers (3)

Mr. Kumar

Reputation: 26

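#include <fstream>
#include <iostream>
#include <string>

using namespace std;

int main()
{
    // Fixed HTML fragments that wrap the generated table rows.
    string name, head = "<html><head></head><body><table>", tail = "</table></body></html>",
           bodystart = "<tr><td>", bodyclose = "</td></tr>";
    ifstream x("example.txt");      // input: whitespace-separated words
    ofstream y("myhtmlfile.html");  // output: one table row per word
    y << head;
    // Test the extraction itself; looping on !x.eof() would emit the last word twice.
    while (x >> name)
    {
        y << bodystart << name << bodyclose;
    }
    y << tail;

    x.close();
    y.close();
    cout << "\n\n";
    return 0;
}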

Upvotes: 0

MSD561

Reputation: 522

You should write a parser.

Read each word, and when you find an opening tag <tag>, look for the next tag. If that is the matching closing tag </tag>, you can create an object from that tag.
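For readers who want something concrete, the scan-and-match step could be sketched roughly as below. This is only one possible approach, not the book's solution: the function names extractTags and tagsAreBalanced are made up for illustration, and the sketch assumes simple tags without attributes (so <a href="..."> would not match </a>).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Collect every tag name in order of appearance. Closing tags keep their
// leading '/', so "<b>hi</b>" yields {"b", "/b"}.
std::vector<std::string> extractTags(const std::string& text)
{
    std::vector<std::string> tags;
    std::size_t pos = 0;
    while ((pos = text.find('<', pos)) != std::string::npos)
    {
        std::size_t end = text.find('>', pos);
        if (end == std::string::npos)
            break;  // unterminated tag: stop scanning
        tags.push_back(text.substr(pos + 1, end - pos - 1));
        pos = end + 1;
    }
    return tags;
}

// Return true when every opening tag is closed by its matching closing tag
// in the right order. The vector is used as a stack of currently open tags.
bool tagsAreBalanced(const std::string& text)
{
    std::vector<std::string> open;
    for (const std::string& tag : extractTags(text))
    {
        if (tag.empty())
            return false;  // "<>" is not a valid tag
        if (tag[0] == '/')
        {
            // A closing tag must match the most recently opened tag.
            if (open.empty() || open.back() != tag.substr(1))
                return false;
            open.pop_back();
        }
        else
        {
            open.push_back(tag);
        }
    }
    return open.empty();  // anything left open is unbalanced
}
```

Calling tagsAreBalanced on a string read with std::getline would tell you whether the input is well nested before you try to do anything else with it.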

The proposed solution implies creating an interface named Tag and derived classes for <html>, <head>, and so on.

In the end you will have an engine (the parser) that consumes text and produces objects.

Upvotes: 1

mmoment

Reputation: 1289

Yes, as @MSD561 says, you could write a parser, either from scratch (reinventing the wheel) or with a library.

An XML library can be used for the second option, and it will also give you a better understanding of the structures involved:

What XML parser should I use in C++?

It will also provide you with all the entries for the tags and so on, so you will only have to walk through the XML tree.

Upvotes: 0
