Neon Flash

Reputation: 3233

WWW::Mechanize in Perl, Script Gets Killed

I have written a Perl script that uses WWW::Mechanize to connect to a site, log in, and then visit a few pages inside the site. It all works well; however, when I try to visit a large number of pages, the script gets killed. I am sure this has nothing to do with the HTTP server's configuration or its connection limits, because the script is running on my own site.

Here's a high-level overview of my script:

use WWW::Mechanize;
use HTTP::Cookies;

my $url  = "http://example.com";
my $mech = WWW::Mechanize->new();
$mech->cookie_jar(HTTP::Cookies->new());
$mech->get($url);

Then I log in to the site using the form fields.

Once I am logged in, I connect to URLs within the site as follows, where $i is the iteration counter of a for loop:

my $internal_url = "http://example.com/index.php?page=$i";
$mech->get($internal_url);

I then perform some operations on the returned page ($mech->content, parsed with HTML::TreeBuilder::XPath).

The loop then continues with a different internal URL, since $i is incremented on every iteration.

As I said, it all works well. However, after about 180 pages, the script gets killed.

What could be the reason? I have tried multiple times.

I even added a $mech->delete; right before the end of the for loop to prevent any memory leak.

The only issue with that is that the login session maintained by $mech is destroyed as a result.

I have tried multiple times and this script always gets killed after visiting the same number of pages.

Thanks.

Upvotes: 0

Views: 897

Answers (1)

gangabass

Reputation: 10666

Try this code:

my $mech = WWW::Mechanize->new();
$mech->stack_depth(0);

or

my $mech = WWW::Mechanize->new(stack_depth => 0);

According to the docs: Get or set the page stack depth. Use this if you're doing a lot of page scraping and running out of memory.

A value of 0 means "no history at all." By default, the max stack depth is humongously large, effectively keeping all history.
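To show how this could fit into the loop from the question, here is a minimal sketch, not the asker's actual script. The helper name scrape_pages, the page range, and the //title XPath query are hypothetical. The explicit $tree->delete is an added assumption: the HTML::TreeBuilder::XPath tree is a second place where memory can be retained, separate from the Mechanize page stack.

```perl
use strict;
use warnings;
use WWW::Mechanize;
use HTTP::Cookies;
use HTML::TreeBuilder::XPath;

# stack_depth => 0 turns off page history, so earlier responses are not kept.
# The cookie jar lives on the same object, so the login session survives.
my $mech = WWW::Mechanize->new(
    stack_depth => 0,
    cookie_jar  => HTTP::Cookies->new(),
);

# Hypothetical helper: visit pages $first..$last and print each page title.
# Assumes $mech has already logged in.
sub scrape_pages {
    my ($mech, $first, $last) = @_;
    for my $i ($first .. $last) {
        $mech->get("http://example.com/index.php?page=$i");

        # Parse the response; the XPath query is only a placeholder.
        my $tree  = HTML::TreeBuilder::XPath->new_from_content($mech->content);
        my $title = $tree->findvalue('//title');
        print "$i: $title\n";

        # Free the parse tree before the next iteration (assumed extra step).
        $tree->delete;
    }
    return;
}
```

After logging in with the same $mech, a call such as scrape_pages($mech, 1, 500) would run the loop; the range 1 to 500 is only an example value.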

Upvotes: 3
