Reputation: 87
I'm trying to follow a link in Perl. My initial code:
use WWW::Mechanize::Firefox;
use Crypt::SSLeay;
use HTML::TagParser;
use URI::Fetch;
$ENV{PERL_LWP_SSL_VERIFY_HOSTNAME}=0; #not verifying certificate
my $url = 'https://';
$url = $url . $ARGV[0]; # first command-line argument is the host
my $mech = WWW::Mechanize::Firefox->new;
$mech->get($url);
$mech->follow_link(tag => 'a', text => '<span class=\"normalNode\">VSCs</span>');
$mech->reload();
I found here that the tag and text options work this way, but I got this error: MozRepl::RemoteObject: SyntaxError: The expression is not a legal expression. I tried escaping some characters in the text, but the error stayed the same. I then changed my code, adding:
my @list = $mech->find_all_links();
my $found = 0;
my $i = 0;
while ($i <= $#list && $found == 0) {
    print $list[$i]->url() . "\n";
    if ($list[$i]->text() =~ /VSCs/) {
        print $list[$i]->text() . "\n";
        my $follow = $list[$i]->url();
        $mech->follow_link( url => $follow );
        $found = 1; # stop after the first matching link
    }
    $i++;
}
But then I get a different error: No link found matching '//a[(@href = "https://... followed by a lot more text that seems to be the link's description. I hope I've made myself clear; if not, please tell me what else to add. Thanks to all for your help.
Here's the part of the page that contains the link I want to follow:
<li id="1" class="liClosed"><span class="bullet clickable"> </span><b><a href="/centcfg/vsc_list.asp?entity=allvsc&selector=All"><span class="normalNode">VSCs</span></a></b>
<ul id="1.l1">
<li id="i1.i1" class="liBullet"><span class="bullet"> </span><b><a href="/centcfg/vsc_edit.asp?entity=vsc&selector=1"><span class="normalNode">First</span></a></b></li>
<li id="i1.i2" class="liBullet"><span class="bullet"> </span><b><a href="/centcfg/vsc_edit.asp?entity=vsc&selector=2"><span class="normalNode">Second</span></a></b></li>
<li id="i1.i3" class="liBullet"><span class="bullet"> </span><b><a href="/centcfg/vsc_edit.asp?entity=vsc&selector=3"><span class="normalNode">Third</span></a></b></li>
<li id="i1.i4" class="liBullet"><span class="bullet"> </span><b><a href="/centcfg/vsc_edit.asp?entity=vsc&selector=4"><span class="normalNode">Fourth</span></a></b></li>
<li id="i1.i5" class="liBullet"><span class="bullet"> </span><b><a href="/centcfg/vsc_edit.asp?entity=vsc&selector=5"><span class="normalNode">None</span></a></b></li>
</ul>
I'm working on Windows 7 with MozRepl 1.1 and 64-bit Strawberry Perl 5.16.2.1.
Upvotes: 3
Views: 892
Reputation: 51
After poking around with the given code, I was able to make W::M::F follow the link in the following manner:
use WWW::Mechanize::Firefox;
use Crypt::SSLeay;
use HTML::TagParser;
use URI::Fetch;
...
$mech->follow_link(xpath => '//a[text() = "<span class=\"normalNode\">VSCs</span>"]');
$mech->reload();
Note the xpath parameter given instead of text.
I didn't take a long look at the W::M::F sources, but under the hood it tries to translate the given text parameter into an XPath string, and if text contains HTML/XML tags, as it does in your case, that probably confuses it.
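One possible contributing factor, offered as a guess rather than something checked against the W::M::F sources: inside a single-quoted Perl string, \" stays as a literal backslash followed by a quote, and XPath 1.0 string literals have no escape sequences, so an expression built from such text may be rejected as malformed. A minimal sketch of an alternative is to match on the link's visible text only, without any markup. The helper name link_xpath_for_text is hypothetical (it is not part of WWW::Mechanize::Firefox), and the sketch assumes the visible text of the link is exactly VSCs, as in the HTML shown in the question.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of WWW::Mechanize::Firefox): build an XPath
# expression that matches <a> elements by their whitespace-normalized text.
sub link_xpath_for_text {
    my ($text) = @_;

    # XPath 1.0 string literals cannot escape quotes, so choose a quote
    # character that does not occur in the text itself.
    die "text contains both quote types\n"
        if $text =~ /"/ && $text =~ /'/;
    my $q = $text =~ /"/ ? "'" : '"';

    return '//a[normalize-space(.) = ' . $q . $text . $q . ']';
}

my $xpath = link_xpath_for_text('VSCs');
print "$xpath\n";    # prints: //a[normalize-space(.) = "VSCs"]

# Assumed usage with an already-loaded $mech object, mirroring the
# follow_link(xpath => ...) call above:
# $mech->follow_link( xpath => $xpath );
```

normalize-space(.) compares against the string value of the whole a element, which includes the text inside the nested span, so the span markup never has to appear in the expression.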
Upvotes: 2
Reputation: 185073
I recommend trying:
$mech->follow_link( url_regex => qr/selector=All/ );
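As a quick sanity check of why this pattern should single out the right link, here is a small sketch that applies the same regex to the href values copied from the HTML in the question. It runs without a browser; the @hrefs list is just an illustration and assumes these are the only candidate links on the page.

```perl
use strict;
use warnings;

# href values copied from the HTML snippet in the question.
my @hrefs = (
    '/centcfg/vsc_list.asp?entity=allvsc&selector=All',
    '/centcfg/vsc_edit.asp?entity=vsc&selector=1',
    '/centcfg/vsc_edit.asp?entity=vsc&selector=2',
    '/centcfg/vsc_edit.asp?entity=vsc&selector=3',
    '/centcfg/vsc_edit.asp?entity=vsc&selector=4',
    '/centcfg/vsc_edit.asp?entity=vsc&selector=5',
);

# Same pattern as passed to url_regex above.
my $pattern = qr/selector=All/;
my @matches = grep { $_ =~ $pattern } @hrefs;

print scalar(@matches), " match(es)\n";
print "$_\n" for @matches;
```

Only the vsc_list.asp link contains selector=All, so the regex avoids both the markup in the link text and the need to know the absolute URL.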
Upvotes: 0