Reputation: 31
I'm pretty new to Perl (and to programming in general) and have been toying around with threads for the last couple of weeks. So far I've understood that using them for many similar parallel tasks is discouraged: memory consumption is uncontrollable if the number of threads depends on input values, and simply limiting that number and doing interim joins seems rather silly. So I tried to trick threads into returning values through queues and then detaching themselves (without my ever joining them). Here's an example with a parallel ping:
#!/usr/bin/perl
#
use strict;
use warnings;
use threads;
use NetAddr::IP;
use Net::Ping;
use Thread::Queue;
use Thread::Semaphore;
########## get my IPs from CIDR-notation #############
my @ips;
for my $cidr (@ARGV) {
    my $n = NetAddr::IP->new($cidr);
    foreach ( @{ $n->hostenumref } ) {
        push @ips, ( split( '/', $_ ) )[0];
    }
}
my $ping = Net::Ping->new("icmp");
my $pq = Thread::Queue->new( @ips, undef ); # ping-worker-queue
my $rq = Thread::Queue->new(); # response queue
my $semaphore = Thread::Semaphore->new(100); # I hoped this might be useful to limit the number of concurrent threads
while ( my $phost = $pq->dequeue() ) {
    $semaphore->down();
    threads->create( { 'stack_size' => 32 * 4096 }, \&ping_th, $phost );
}
sub ping_th {
    $rq->enqueue( $_[0] ) if $ping->ping( $_[0], 1 );
    $semaphore->up();
    threads->detach();
}
$rq->enqueue(undef);
while ( my $alive_ip = $rq->dequeue() ) {
    print $alive_ip, "\n";
}
I couldn't find a comprehensive description of how threads->detach() behaves when called from within a thread's subroutine, and thought this might work. It does, but only if I do something in the main program (thread) that stretches its lifetime (sleep does well), so that all the detached threads can finish and enqueue their part to my $rq. Otherwise it runs some threads, collects their results into the queue, and exits with warnings like:
Perl exited with active threads:
5 running and unjoined
0 finished and unjoined
0 running and detached
Making the main program "sleep" for a while, once again, seems silly. Is there no way to make threads do their work and detach ONLY after the actual threads->detach() call? So far my guess is that threads->detach() inside a sub applies as soon as the thread is created, so this is not the way. I tried this with CentOS's good old v5.10.1. Would this change with a modern v5.16 or v5.18 (compiled with usethreads)?
Upvotes: 3
Views: 2126
Reputation: 53478
Detaching a thread isn't particularly useful here, because you're effectively saying 'I don't care when it exits'.
This isn't typically what you want: your process is finishing with threads still running.
More generally, creating threads has an overhead, because each new thread gets its own copy of your program's data in memory, so you want to avoid doing it repeatedly. Thread::Queue is good to use, because it's a thread-safe way of transferring information. In your code, though, you don't actually need it for $pq, because you're not yet threading at the point where you use it.
Your semaphore is one approach to limiting concurrency, but can I suggest an alternative:
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use Thread::Queue;
use Net::Ping;
my $nthreads = 100;
#placeholder: fill this with the addresses to ping (here taken from the command line)
my @ip_list = @ARGV;
my $ping_q = Thread::Queue -> new();
my $result_q = Thread::Queue -> new();
sub ping_host {
    my $pinger = Net::Ping -> new("icmp");
    while ( my $hostname = $ping_q -> dequeue() ) {
        if ( $pinger -> ping ( $hostname, 1 ) ) {
            $result_q -> enqueue ( $hostname );
        }
    }
}
#start the threads
for ( 1..$nthreads ) {
    threads -> create ( \&ping_host );
}
#queue the workload
$ping_q -> enqueue ( @ip_list );
#close the queue, so '$ping_q -> dequeue' returns undef, breaking the while loop.
#(end() needs Thread::Queue 3.01 or later)
$ping_q -> end();
#wait for pingers to finish.
foreach my $thr ( threads -> list() ) {
    $thr -> join();
}
$result_q -> end();
#collate results
while ( my $successful_host = $result_q -> dequeue_nb() ) {
    print $successful_host, "\n";
}
This way you spawn the threads up front, queue the targets, and then collate the results when you're done. You don't incur the overhead of repeatedly respawning threads, and your program will wait until all the threads are done. That may take a while, because every host that is down costs a worker the full ping timeout.
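The same pool pattern can be tried without network access or root privileges (ICMP pings usually need them). The sketch below is an illustration rather than the code above: is_alive is a hypothetical stand-in for the ping, run_pool is a made-up helper name, and the workers are shut down with one undef sentinel each instead of end(), on the assumption that the installed Thread::Queue may predate 3.01.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hypothetical stand-in for the ping: an item counts as "alive" if it is even.
sub is_alive {
    my ($item) = @_;
    return $item % 2 == 0;
}

# Run $check over @items on a fixed pool of $nthreads workers and
# return the items that passed, in no particular order.
sub run_pool {
    my ( $nthreads, $check, @items ) = @_;

    my $work_q   = Thread::Queue->new();
    my $result_q = Thread::Queue->new();

    # Start the workers up front; they block in dequeue() until work arrives.
    my @workers;
    for ( 1 .. $nthreads ) {
        push @workers, threads->create(
            sub {
                # An undef item is the "no more work" sentinel.
                while ( defined( my $item = $work_q->dequeue() ) ) {
                    $result_q->enqueue($item) if $check->($item);
                }
            }
        );
    }

    # Queue the work, followed by one sentinel per worker.
    $work_q->enqueue( @items, (undef) x $nthreads );

    # join() blocks until each worker has seen its sentinel and returned.
    $_->join() for @workers;

    # Every producer has been joined, so a non-blocking drain is safe.
    my @passed;
    while ( defined( my $item = $result_q->dequeue_nb() ) ) {
        push @passed, $item;
    }
    return @passed;
}

my @alive = sort { $a <=> $b } run_pool( 4, \&is_alive, 1 .. 10 );
print "$_\n" for @alive;
```

Because the loops test defined(...) rather than plain truth, a false item such as 0 would not stop a worker or the final drain early.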
Upvotes: 7
Reputation: 50637
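Since detached threads can't be joined, you can instead poll until the threads have finished their jobs. threads->list() only reports threads that are neither joined nor detached, so this works in your code because each thread detaches itself as its last step: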
sleep 1 while threads->list();
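For context, here is a minimal sketch of that polling line inside a complete program. It assumes, as in the question, that each thread calls threads->detach() as its very last step; the worker sub and the squaring "job" are hypothetical placeholders for the real ping.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $rq = Thread::Queue->new();    # response queue

# Hypothetical job: report the square of the argument, then detach.
sub worker {
    my ($n) = @_;
    $rq->enqueue( $n * $n );
    threads->detach();            # last step, so the result is already queued
}

threads->create( \&worker, $_ ) for 1 .. 5;

# In scalar context threads->list() counts the threads that are neither
# joined nor detached, so this loops until every worker has detached.
sleep 1 while threads->list();

# All results were enqueued before the detach calls, so drain without blocking.
my @squares;
while ( defined( my $v = $rq->dequeue_nb() ) ) {
    push @squares, $v;
}
@squares = sort { $a <=> $b } @squares;
print "@squares\n";
```

A thread that has just detached may still be winding down when the main thread exits, so joining the threads remains the more robust way to wait.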
Upvotes: 1