Azaghal

Reputation: 430

Use perl WWW::Mechanize on a local file

I'm currently working on a Perl script that uses the CPAN module WWW::Mechanize to get HTML pages from websites. However, I would also like to be able to work on offline HTML files (most likely ones I have saved myself beforehand) so that I don't need an internet connection each time I try a new script. So basically my question is: how can I transform this:

$mech->get( 'http://www.websiteadress.html' );

into this:

$mech->get( 'C:\User\myfile.html' );

I've seen that file:// could be useful but I obviously don't know how to use it as I get errors every time.

Upvotes: 4

Views: 828

Answers (1)

Dave Cross

Reputation: 69314

The get() method from WWW::Mechanize takes a URL as its argument. So you just need to work out what the correct URL is for your local file. You're on the right lines with the "file://" scheme.

I think you will need:

$mech->get( 'file:///C:/User/myfile.html' );

Note two important things that people often get wrong.

  1. URLs only understand forward slashes (/), so you need to convert Windows' warped backslash (\) monstrosities. Update: As Borodin points out in a comment, this isn't true - you can use backslashes in URLs. However, backslashes often have special meanings in Perl strings, so I'd advise using forward slashes whenever possible.
  2. The scheme is file, which is followed by :// (with two slashes), then the hostname (which is an empty string), a slash (/), and then your local path (C:/). That means there are three slashes after file:. This looks wrong, so people often omit one of them. Update: description made more accurate following advice from Borodin in a comment.
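
If you would rather not build the URL by hand, here is a minimal sketch that lets the URI::file module (part of the URI distribution, which WWW::Mechanize already depends on) do the conversion. It assumes a throwaway HTML file created with File::Temp as a stand-in for your saved page; the file contents and the "Saved page" title are made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use URI::file;
use WWW::Mechanize;

# Create a small HTML file to stand in for a page saved earlier
# (hypothetical content, only here so the example runs on its own).
my ($fh, $path) = tempfile(SUFFIX => '.html', UNLINK => 1);
print {$fh} '<html><head><title>Saved page</title></head>'
          . '<body><a href="next.html">Next</a></body></html>';
close $fh or die "Cannot close $path: $!";

# Let URI::file turn the native path into an absolute file:// URL,
# so slash direction and the number of slashes are handled for us.
my $url = URI::file->new_abs($path);

# autocheck is on by default, so a failed get() dies with an error.
my $mech = WWW::Mechanize->new;
$mech->get($url->as_string);

print "URL:   $url\n";
print "Title: ", $mech->title, "\n";
```

For your own file you would replace $path with something like 'C:/User/myfile.html'. The rest of the script can then use $mech exactly as it would after fetching a page over HTTP.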

Wikipedia (as always) has a lot more information in its "file URI scheme" article.

Upvotes: 6
