Reputation: 15
I want to use threads in Perl to increase the speed of my program ... for example i want to use 20 threads in this code:
use IO::Socket;
my $in_file2 = 'rang.txt';
open DAT,$in_file2;
my @ip=<DAT>;
close DAT;
chomp(@ip);
foreach my $ip(@ip)
{
$host = IO::Socket::INET->new(
PeerAddr => $ip,
PeerPort => 80,
proto => 'tcp',
Timeout=> 1
)
and open(OUT, ">>port.txt");
print OUT $ip."\n";
close(OUT);
}
In the above code we give a list of ips and scan a given port. I want use threads in this code. Is there any other way to increase the speed of my code?
Thanks.
Upvotes: 2
Views: 3053
Reputation: 53478
Perl can do both threading and forking. "threads" is officially not recommended - in no small part because it's not well understood, and - perhaps slightly counterintutively - isn't lightweight like threads are in some programming languages.
If you are particularly keen to thread, the 'worker' model of threading works much better than spawning a thread per task. You might do the latter in some languages - in perl it's very inefficient.
As such you might do something like this:
#!/usr/bin/env perl
use strict;
use warnings;
use threads;
use Thread::Queue;
use IO::Socket;
my $nthreads = 20;
my $in_file2 = 'rang.txt';
my $work_q = Thread::Queue->new;
my $result_q = Thread::Queue->new;
sub ip_checker {
while ( my $ip = $work_q->dequeue ) {
chomp($ip);
$host = IO::Socket::INET->new(
PeerAddr => $ip,
PeerPort => 80,
proto => 'tcp',
Timeout => 1
);
if ( defined $host ) {
$result_q->enqueue($ip);
}
}
}
sub file_writer {
open( my $output_fh, ">>", "port.txt" ) or die $!;
while ( my $ip = $result_q->dequeue ) {
print {$output_fh} "$ip\n";
}
close($output_fh);
}
for ( 1 .. $nthreads ) {
push( @workers, threads->create( \&ip_checker ) );
}
my $writer = threads->create( \&file_writer );
open( my $dat, "<", $in_file2 ) or die $!;
$work_q->enqueue(<$dat>);
close($dat);
$work_q->end;
foreach my $thr (@workers) {
$thr->join();
}
$result_q->end;
$writer->join();
This uses a queue to feed a set of (20) worker threads with an IP list, and work their way through them, collating and printing results through the writer
thread.
But as threads aren't really recommended any more, a better way might be to use Parallel::ForkManager
which with your code might go a bit like this:
#!/usr/bin/env perl
use strict;
use warnings;
use Fcntl qw ( :flock );
use IO::Socket;
my $in_file2 = 'rang.txt';
open( my $input, "<", $in_file2 ) or die $!;
open( my $output, ">", "port.txt" ) or die $!;
my $manager = Parallel::ForkManager->new(20);
foreach my $ip (<$input>) {
$manager->start and next;
chomp($ip);
my $host = IO::Socket::INET->new(
PeerAddr => $ip,
PeerPort => 80,
proto => 'tcp',
Timeout => 1
);
if ( defined $host ) {
flock( $output, LOCK_EX ); #exclusive or write lock
print {$output} $ip, "\n";
flock( $output, LOCK_UN ); #unlock
}
$manager->finish;
}
$manager->wait_all_children;
close($output);
close($input);
You need to be particularly careful of file IO when multiprocessing, because the whole point is your execution sequence is no longer well defined. So it's insanely easy to end up with different threads clobbering files that another thread has open, but hasn't flushed to disk.
I note your code - you seem to rely on failing a file open, in order to not print to it. That's not a nice thing to do, especially when your file handle is not lexically scoped.
But in both multiprocessing paradigms I outlined above (there are others, these are the most common) you still have to deal with the file IO serialisation. Note that your 'results' will be in a random order in both, because it'll very much depend on when the task completes. If that's important to you, then you'll need to collate and sort after your threads or forks complete.
It's probably generally better to look towards forking - as said above, in threads
docs:
The "interpreter-based threads" provided by Perl are not the fast, lightweight system for multitasking that one might expect or hope for. Threads are implemented in a way that make them easy to misuse. Few people know how to use them correctly or will be able to provide help. The use of interpreter-based threads in perl is officially discouraged.
Upvotes: 3
Reputation: 118128
Instead of using threads, you might want to look into AnyEvent::Socket, or Coro::Socket, or POE, or Parallel::ForkManager.
Upvotes: 6