How can I make my Perl script use multiple cores for child processes?

I'm working on a mathematical model that uses data generated from XFOIL, a popular aerospace tool used to find the lift and drag coefficients on airfoils.

I have a Perl script that calls XFOIL repeatedly with different input parameters to generate the data I need. I need XFOIL to run 5,600 times, at around 100 seconds per run, soabout 6.5 days to complete.

I have a quad-core machine, but my experience as a programmer is limited, and I really only know how to use basic Perl.

I would like to run four instances of XFOIL at a time, all on their own core. Something like this:

while ( 1 ) {

    for ( i = 1..4 ) {

        if ( ! exists XFOIL_instance(i) ) {

            start_new_XFOIL_instance(i, input_parameter_list);
        }
    }
}

So the program is checking (or preferably sleeping) until an XFOIL instance is free, when we can start a new instance with the new input parameter list.

Upvotes: 16

Answers (5)

ashraf

Reputation: 557

This is quite old but if someone is still looking for suitable answers to this question, you might want to consider Perl Many-Core-Engine (MCE)

Upvotes: 0

user1147654

Reputation:

Did you consider gnu parallel parallel. It will allow you to run several install instances of your program with different inputs and fill your CPU cores as they begin available. It's often a very simple an efficient way to achieve parallelization of simple tasks.

Upvotes: 0

Schwern

Reputation: 164809

Perl threads will take advantage of multiple cores and processors. The main pro of threads is its fairly easy to share data between the threads and coordinate their activities. A forked process cannot easily return data to the parent nor coordinate amongst themselves.

The main cons of Perl threads is they are relatively expensive to create compared to a fork, they must copy the entire program and all its data; you must have them compiled into your Perl; and they can be buggy, the older the Perl, the buggier the threads. If your work is expensive, the creation time should not matter.

Here's an example of how you might do it with threads. There's many ways to do it, this one uses Thread::Queue to create a big list of work your worker threads can share. When the queue is empty, the threads exit. The main advantages are that its easier to control how many threads are active, and you don't have to create a new, expensive thread for each bit of work.

This example shoves all the work into the queue at once, but there's no reason you can't add to the queue as you go. If you were to do that, you'd use dequeue instead of dequeue_nb which will wait around for more input.

use strict;
use warnings;

use threads;
use Thread::Queue;

# Dummy work routine
sub start_XFOIL_instance {
    my $arg = shift;
    print "$arg\n";
    sleep 1;
}

# Read in dummy data
my @xfoil_args = <DATA>;
chomp @xfoil_args;

# Create a queue to push work onto and the threads to pull work from
# Populate it with all the data up front so threads can finish when
# the queue is exhausted.  Makes things simpler.
# See https://rt.cpan.org/Ticket/Display.html?id=79733
my $queue = Thread::Queue->new(@xfoil_args);

# Create a bunch of threads to do the work
my @threads;
for(1..4) {
    push @threads, threads->create( sub {
        # Pull work from the queue, don't wait if its empty
        while( my $xfoil_args = $queue->dequeue_nb ) {
            # Do the work
            start_XFOIL_instance($xfoil_args);
        }

        # Yell when the thread is done
        print "Queue empty\n";
    });
}

# Wait for threads to finish
$_->join for @threads;

__DATA__
blah
foo
bar
baz
biff
whatever
up
down
left
right

Upvotes: 5

James Thompson

Reputation: 48172

Try Parallel::ForkManager. It's a module that provides a simple interface for forking off processes like this.

Here's some example code:

#!/usr/bin/perl

use strict;
use warnings;
use Parallel::ForkManager;

my @input_parameter_list = 
    map { join '_', ('param', $_) }
    ( 1 .. 15 );

my $n_processes = 4;
my $pm = Parallel::ForkManager->new( $n_processes );
for my $i ( 1 .. $n_processes ) {
    $pm->start and next;

    my $count = 0;
    foreach my $param_set (@input_parameter_list) {         
        $count++;
        if ( ( $count % $i ) == 0 ) {
            if ( !output_exists($param_set) ) {
                start_new_XFOIL_instance($param_set);
            }
        }
    }

    $pm->finish;
}
$pm->wait_all_children;

sub output_exists {
    my $param_set = shift;
    return ( -f "$param_set.out" );
}

sub start_new_XFOIL_instance {
    my $param_set = shift;
    print "starting XFOIL instance with parameters $param_set!\n";
    sleep( 5 );
    touch( "$param_set.out" );
    print "finished run with parameters $param_set!\n";
}

sub touch {
    my $fn = shift;
    open FILE, ">$fn" or die $!;
    close FILE or die $!;
}

You'll need to supply your own implementations for the start_new_XFOIL_instance and the output_exists functions, and you'll also want to define your own sets of parameters to pass to XFOIL.

Upvotes: 17

Daniel

Reputation: 7172

This looks like you can use gearman for this project.

www.gearman.org

Gearman is a job queue. You can split your work flow into a lot of mini parts.

I would recommend using amazon.com or even their auction able servers to complete this project.

Spending 10cents per computing hour or less, can significantly spead up your project.

I would use gearman locally, make sure you have a "perfect" run for 5-10 of your subjobs before handing it off to an amazon compute farm.

Upvotes: 4

How can I make my Perl script use multiple cores for child processes?

Answers (5)

Related Questions