Reputation: 15

Extracting HTML from a URL using Perl

I want to extract the HTML code of a TWiki page (whose URL I have). What is the best way of doing that?

Additionally, once I have extracted the HTML code, I need to put it in a site hosted on Google Sites. Is that possible?

Upvotes: 2

Answers (2)

Miguel Prz

Reputation: 13792

A very simple way to fetch an HTML page is the LWP::Simple module. If you need a more complex navigation flow (following links, submitting forms), use WWW::Mechanize instead. Then, if you need to parse the HTML code, Brian Agnew's HTML::Parser suggestion is a good one.
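As a minimal sketch of the LWP::Simple approach, assuming the page is publicly readable without a login: the helper name `fetch_html` is hypothetical, and the URL is taken from the command line rather than hard-coded.

```perl
use strict;
use warnings;
use LWP::Simple qw(get);

# Hypothetical helper: fetch a page and return its HTML, or die on failure.
sub fetch_html {
    my ($url) = @_;
    my $html = get($url);    # LWP::Simple::get returns undef if the request fails
    die "Could not fetch $url\n" unless defined $html;
    return $html;
}

# Usage: perl fetch.pl <url-of-the-twiki-page>
if (@ARGV) {
    print fetch_html($ARGV[0]);
}
```

LWP::Simple gives no detail about why a request failed, so if the wiki requires authentication or cookies, LWP::UserAgent or WWW::Mechanize is likely a better fit.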

Upvotes: 2

Brian Agnew

Reputation: 272267

Sounds like you need the CPAN HTML::Parser module.

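use strict;
use warnings;
use HTML::Parser ();

# Handlers called for each start and end tag (here they just print the tag name)
sub start { my ($tagname, $attr) = @_; print "start: $tagname\n"; }
sub end   { my ($tagname) = @_;        print "end: $tagname\n"; }

# Create parser object
my $p = HTML::Parser->new( api_version => 3,
                           start_h => [\&start, "tagname, attr"],
                           end_h   => [\&end,   "tagname"],
                           marked_sections => 1,
                         );

# Parse directly from file; parse_file returns undef if the file cannot be opened
$p->parse_file("foo.html") or die "Cannot parse foo.html: $!";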
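If the HTML is already in memory (for example, after fetching it from the URL), the parser can be fed a string instead of a file. The following is only a sketch: the helper name `extract_links` and the sample markup are made up for illustration, and collecting `href` values is just one example of what a start-tag handler might do.

```perl
use strict;
use warnings;
use HTML::Parser ();

# Hypothetical helper: return the href of every <a> tag in an HTML string.
sub extract_links {
    my ($html) = @_;
    my @links;
    my $p = HTML::Parser->new(
        api_version => 3,
        start_h     => [
            sub {
                my ($tagname, $attr) = @_;
                # Tag names and attribute names are reported in lower case
                push @links, $attr->{href}
                    if $tagname eq 'a' && defined $attr->{href};
            },
            "tagname, attr",
        ],
    );
    $p->parse($html);
    $p->eof;    # signal end of document so any buffered input is processed
    return @links;
}

# Sample input, invented for this example
my @links = extract_links('<p><a href="/one">1</a> <a href="/two">2</a></p>');
print "$_\n" for @links;
```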

Upvotes: 1
