GilAlexander

Reputation: 23

Perl - Using LWP in a multithreaded program

Here's my code:

use LWP;
use threads;

use strict;
use warnings;

my @thrds;

for (1..100)
{
    push @thrds, threads->create( 'doit' );
}

for (@thrds)
{
    $_->join();
}

sub doit
{
    LWP::UserAgent->new->get("http://dx.doi.org/10.1002/aoc.1067");
}

I'm using Windows 7 x64 with ActivePerl 5.20.2 x64; I also tried Strawberry Perl. I get a bunch of errors:

Thread ... terminated abnormally: Can't locate object method "_uric_escape" via package "URI" at .../URI.pm line 81

String found where operator expected at (eval 10) line 8, near "croak 'usage: $io->getlines()'"

(Do you need to predeclare croak?)

If I add

sleep 1;

before

push @thrds, threads->create( 'doit' );

it'll be ok.
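
That is, with the workaround the spawn loop looks like this (the only change is the sleep):

for (1..100)
{
    sleep 1;    # throttle: create at most one new thread per second
    push @thrds, threads->create( 'doit' );
}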

What's the problem?

Upvotes: 0

Views: 983

Answers (4)

AndyH

Reputation: 383

I know this is an old topic, but just yesterday I ran into this problem, so the information below may be useful to someone. I was getting several kinds of errors from LWP in similar code (parallel HTTP requests), including:

  • "String found where operator expected at ... near "croak ..."
  • "Deep recursion on subroutine "IO::Socket::new" ... Out of memory!"
  • Sometimes requests just failed (returned status 500 or 501).

The problem seems to be gone after I added the following line:

use IO::File;

Don't ask me why, I have no idea :) I figured this out by accident. I am using Strawberry Perl 5.24.0.
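
Applied to the code from the question, that means pre-loading IO::File in the main thread before any threads are spawned. A sketch, with the rest of the script as posted:

use strict;
use warnings;

use LWP;
use IO::File;    # pre-loading this in the main thread made the errors disappear
use threads;

my @thrds;

push @thrds, threads->create( 'doit' ) for 1 .. 100;
$_->join() for @thrds;

sub doit
{
    LWP::UserAgent->new->get("http://dx.doi.org/10.1002/aoc.1067");
}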

Upvotes: 0

Borodin

Reputation: 126722

This is very straightforward using the Mojolicious framework. The Mojo::UserAgent class has been designed to work asynchronously with the help of the Mojo::IOLoop module, and although synchronous transactions are available, they are implemented as special cases of the standard asynchronous calls.

use strict;
use warnings;

use Mojo::UserAgent;

my $ua = Mojo::UserAgent->new( max_redirects => 5 );

my $n;

for ( 1 .. 100 ) {
  $ua->get('http://dx.doi.org/10.1002/aoc.1067' => \&completed);
  ++$n;
}

Mojo::IOLoop->start;

sub completed {
  my ($ua, $tx) = @_;

  Mojo::IOLoop->stop unless --$n > 0;
}
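
The $n counter tracks how many requests are still outstanding: it is incremented once for each get that is queued, decremented in each completion callback, and Mojo::IOLoop->stop is called only when it reaches zero. Without that, the event loop started by Mojo::IOLoop->start would never return.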

A quick benchmark gave the following results:

  • 100 synchronous GET requests took 178 seconds
  • 100 asynchronous GET requests took 10 seconds
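
For reference, the synchronous timing presumably came from a plain blocking loop along these lines (my reconstruction, not the exact benchmark script): each get call completes before the next one starts, so the run time is roughly the sum of 100 round trips.

use strict;
use warnings;

use Mojo::UserAgent;

my $ua = Mojo::UserAgent->new( max_redirects => 5 );

# Blocking calls: each request must finish before the next one starts
$ua->get('http://dx.doi.org/10.1002/aoc.1067') for 1 .. 100;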

Upvotes: 0

Sobrique

Reputation: 53478

I think the problem here will be memory footprint. You are - after all - loading the LWP library, and then cloning your process 100 times.

Contrary to popular belief, threads in Perl are not even remotely lightweight. They are not well suited to this model of usage: each thread is a 'full copy' of your process, and that's just ... not a great plan.

My copy of Perl (ActivePerl 5.20.2) doesn't exhibit the same problem, I think (I didn't actually want to spam the website you list).

I would suggest instead that you rewrite your threading to use Thread::Queue with a lower degree of parallelism:

use strict;
use warnings;
use LWP;
use threads;
use Thread::Queue;

my $workers = 10;

my $work_q = Thread::Queue->new();
my $url    = "http://localhost:80";

sub worker_thread {
    while ( my $url = $work_q->dequeue ) {
        LWP::UserAgent->new->get($url);
    }
}

threads->create( \&worker_thread ) for 1 .. $workers;

for ( 1 .. 100 ) {
    $work_q->enqueue($url);
}
$work_q->end;

foreach my $thread ( threads->list ) {
    $thread->join();
}

Alternatively, a fork-based approach may work better:

use strict;
use warnings;
use Parallel::ForkManager;
use LWP;

my $manager = Parallel::ForkManager->new(10);

for ( 1 .. 100 ) {
    $manager->start and next;
    LWP::UserAgent->new->get("http://localhost:80");
    $manager->finish;
}

$manager->wait_all_children;

Edit:

A bit more testing on your sample: I do get similar runtime errors on RHEL running Perl 5.20.2, though only sporadically.

They vary somewhat, which really does suggest there is some kind of race condition going on here. Which is odd, because threads are supposed to be standalone, yet here they clearly aren't.

I cannot see an obvious source, though. It might be as simple as too many open file descriptors or too much memory consumed; this is quite a hefty memory burden. In particular, my kernel kills the process because of memory exhaustion, thanks to a footprint of a few gigabytes, which is a pretty good reason not to use this approach.

Upvotes: 1

ikegami

Reputation: 385809

I'm not sure why, but there seem to be problems with modules that are loaded dynamically after the threads have been created. Explicitly loading them before thread creation solves the problem. In other words, add the following:

use Carp qw( );
use URI  qw( );
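
For completeness, the top of the question's script with those pre-loads in place looks like this (the rest of the script is unchanged):

use strict;
use warnings;

use LWP;
use Carp qw( );    # pre-load Carp and URI in the main thread so the
use URI  qw( );    # threads don't have to load them lazily mid-request
use threads;

# ... then spawn the 100 threads, join them and define doit() exactly as in the question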

Upvotes: 2
