Reputation: 23
Here's my code:
use LWP;
use threads;
use strict;
use warnings;

my @thrds;

for (1..100)
{
    push @thrds, threads->create( 'doit' );
}

for (@thrds)
{
    $_->join();
}

sub doit
{
    LWP::UserAgent->new->get("http://dx.doi.org/10.1002/aoc.1067");
}
I'm using Windows 7 x64 with ActivePerl 5.20.2 x64 (I also tried Strawberry Perl), and I get a bunch of errors:
Thread ... terminated abnormally: Can't locate object method "_uric_escape" via package "URI" at .../URI.pm line 81
String found where operator expected at (eval 10) line 8, near "croak 'usage: $io->getlines()'"
(Do you need to predeclare croak?)
If I add
sleep 1;
before
push @thrds, threads->create( 'doit' );
everything works fine.
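That is, the loop that works becomes:

for (1..100)
{
    sleep 1;    # waiting a second between thread creations makes the errors disappear
    push @thrds, threads->create( 'doit' );
}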
What's the problem?
Upvotes: 0
Views: 983
Reputation: 383
I know this is an old topic, but just yesterday I ran into this problem, so the information below may be useful to someone. I was getting several kinds of errors from LWP in similar code (parallel HTTP requests).
The problem seems to be gone after I added the following line:
use IO::File;
Don't ask me why, I have no idea :) I figured this out accidentally. I am using Strawberry Perl 5.24.0.
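Applied to the code from the question, it would look something like this (just a sketch; the only change is the extra use line):

use LWP;
use threads;
use strict;
use warnings;
use IO::File;    # pre-loading IO::File before any threads are created made the errors go away for me

my @thrds;

push @thrds, threads->create( 'doit' ) for 1 .. 100;
$_->join() for @thrds;

sub doit
{
    LWP::UserAgent->new->get("http://dx.doi.org/10.1002/aoc.1067");
}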
Upvotes: 0
Reputation: 126722
This is very straightforward using the Mojolicious framework. The Mojo::UserAgent class is designed to work asynchronously with the help of the Mojo::IOLoop module, and although synchronous transactions are available, they are implemented as special cases of the standard asynchronous calls.
use strict;
use warnings;

use Mojo::UserAgent;
use Mojo::IOLoop;

my $ua = Mojo::UserAgent->new( max_redirects => 5 );

my $n = 0;

for ( 1 .. 100 ) {
    $ua->get( 'http://dx.doi.org/10.1002/aoc.1067' => \&completed );
    ++$n;
}

Mojo::IOLoop->start;

# Stop the event loop once the last outstanding request has completed
sub completed {
    my ( $ua, $tx ) = @_;
    Mojo::IOLoop->stop unless --$n > 0;
}
A quick benchmark gave the following results
Upvotes: 0
Reputation: 53478
I think the problem here will be memory footprint. You are, after all, loading the LWP library and then cloning your process 100 times.
Contrary to popular belief, threads in Perl are not even remotely lightweight. They are not well suited to this model of usage: each thread is a full copy of your process, and that's just not a great plan.
My copy of Perl (ActivePerl 5.20.2) doesn't seem to exhibit the same problem, though I can't be sure: I didn't actually want to spam the website you list.
I would suggest instead that you rewrite your threading to use Thread::Queue and a lower degree of parallelism:
use strict;
use warnings;

use LWP;
use threads;
use Thread::Queue;

my $workers = 10;
my $work_q  = Thread::Queue->new();

my $url = "http://localhost:80";

sub worker_thread {
    while ( my $url = $work_q->dequeue ) {
        LWP::UserAgent->new->get($url);
    }
}

threads->create( \&worker_thread ) for 1 .. $workers;

for ( 1 .. 100 ) {
    $work_q->enqueue($url);
}
$work_q->end;

foreach my $thread ( threads->list ) {
    $thread->join();
}
Alternatively, a fork-based approach may work better:
use strict;
use warnings;

use Parallel::ForkManager;
use LWP;

my $manager = Parallel::ForkManager->new(10);

for ( 1 .. 100 ) {
    $manager->start and next;
    LWP::UserAgent->new->get("http://localhost:80");
    $manager->finish;
}

$manager->wait_all_children;
Edit:
A bit more testing on your sample: I do get similar runtime errors on RHEL running Perl 5.20.2, although only sporadically. They vary somewhat, which really does suggest there is some kind of race condition going on here. That is odd, because threads are supposed to be standalone, yet apparently they are not entirely.
In particular, my kernel kills the process because of memory exhaustion, thanks to a footprint of a few gigabytes, which is a pretty good reason not to use this approach.
I cannot see an obvious source of the errors, though. It might be as simple as too many open file descriptors or too much memory consumed; either way, this is quite a hefty memory burden.
Upvotes: 1
Reputation: 385809
I'm not sure why, but there seem to be problems dealing with dynamically-loaded modules. Explicitly loading them before thread creation solves the problem. In other words, add the following:
use Carp qw( );
use URI qw( );
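For example, near the top of the script from the question (a sketch; everything else stays the same):

use strict;
use warnings;
use threads;
use LWP;
use Carp qw( );    # load Carp in the main thread, before any threads are spawned
use URI  qw( );    # same for URI, so it is not loaded lazily inside a thread

my @thrds;

push @thrds, threads->create( 'doit' ) for 1 .. 100;
$_->join() for @thrds;

sub doit
{
    LWP::UserAgent->new->get("http://dx.doi.org/10.1002/aoc.1067");
}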
Upvotes: 2