baski

Reputation: 33

Start/continue parent process only after all child processes complete

I want to run several child processes in parallel, and start/continue the parent process only after all of them have finished. Below is my sample code:

    foreach ('abc.gz','efg.gz','123.gz','xyz.gz')
    {
        my $pid = fork;
        if ($pid == 0) {            
            exec("tar -xvf $_");
            exit 0;
        }
    }
    wait();
    `tar -xvf 'parent.gz'`;

In the code above, I want to extract "parent.gz" only after all the child extractions have finished, but it is being extracted while the children are still running. Please help.

I can use only Perl core modules; my Perl version is v5.10.1.

Thanks, Baski

Upvotes: 1

Views: 310

Answers (1)

Sobrique

Reputation: 53508

The problem here is that fork is all about parallel processing, and your wait() call doesn't wait for all the children. It blocks only until one child terminates (whichever finishes first) and returns that child's pid. See: perldoc -f wait
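As a rough illustration of that behaviour, here is a minimal sketch that uses sleep as a stand-in for the real work. The delays and variable names are made up for the example; the point is only that a single wait() returns once any one child has exited, and that further calls are needed to reap the rest.

```perl
use strict;
use warnings;

my %delay_for;                        # pid => seconds that child sleeps (illustrative)
for my $seconds (3, 1, 2) {
    my $pid = fork;
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {                  # child: pretend to work, then exit
        sleep $seconds;
        exit 0;
    }
    $delay_for{$pid} = $seconds;      # parent: remember what this child does
}

my $first = wait();                   # returns as soon as ONE child has exited
print "wait() returned pid $first (slept $delay_for{$first}s)\n";

my $reaped = 1;
$reaped++ while wait() > 0;           # keep reaping until wait() returns -1
print "reaped $reaped children in total\n";
```

Anything placed directly after the first wait() would run while the slower children are still going, which is what happens to parent.gz in the question.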

I will also point out that you're probably not going to gain much by forking here. fork doesn't make anything run faster; it merely removes some contention and lets you use certain resources (CPUs) in parallel. Your limiting factor here is probably not the CPU (it might be, thanks to the decompression; if these weren't .gz files it almost certainly wouldn't be). Disks are usually the slowest thing on your system, so they will likely be the bottleneck, and writing in parallel won't gain you much speed.

Also: You probably want the z flag in your tar command, because they look like gzip compressed files (I'm assuming they're also tarballs, because this makes no sense at all if they're not).

By far the easiest way of doing this is the Parallel::ForkManager module:

use strict;
use warnings;
use Parallel::ForkManager;

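my $manager = Parallel::ForkManager -> new ( 10 );   # at most 10 children at once
foreach my $file ('abc.gz','efg.gz','123.gz','xyz.gz')
{
    $manager -> start and next;    # parent: move on to the next file
    # child: run tar, then hand its success or failure back via finish
    my $status = system("tar -xvf $file");
    $manager -> finish ( $status == 0 ? 0 : 1 );
}
$manager -> wait_all_children;     # block until every child has exited
`tar -xvf 'parent.gz'`;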

However, if you're set on not using extra modules (a restriction that comes up frequently, and is a bad one to apply), you could simply call wait once per child (four times here). I don't like that solution, though, because it can be tripped up by other implicit forks.

So I would suggest using waitpid() instead.

my @pids; 
foreach ('abc.gz','efg.gz','123.gz','xyz.gz')
{
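    my $pid = fork;
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # child: exec replaces this process, so exit runs only if exec failed
        exec("tar -xvf $_");
        exit 1;
    } 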
    else { 
       push ( @pids, $pid ); 
    }
}
foreach my $pid ( @pids ) {
   print "Waiting for pid $pid\n"; 
   waitpid ( $pid, 0 ); 
}

`tar -xvf 'parent.gz'`;

Alternatively, as a way to reap all children:

my $result = wait(); 
while ( $result >= 0 ) {
    $result = wait(); 
}

wait should return -1 if there are no children left, which ends the loop.

Or, as ikegami points out, this can be reduced to:

1 while wait > 0;

(Which does the same thing, more concisely)
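Putting the pieces together with core Perl only, one possible sketch is below. The helper name run_all and its return value are hypothetical, not part of any module. It also checks $? for each reaped child, so a failed extraction is not silently ignored, and it assumes the archives are gzip-compressed tarballs (hence the z flag in the usage comment).

```perl
use strict;
use warnings;

# Hypothetical helper: run each command (an array ref) in its own child,
# wait for all of them, and return how many did not exit cleanly.
sub run_all {
    my @commands = @_;
    my %pending;                          # pid => command still running
    for my $cmd (@commands) {
        my $pid = fork;
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            # child: the block form of exec never goes through the shell
            exec { $cmd->[0] } @$cmd;
            die "exec @$cmd failed: $!";  # reached only if exec itself failed
        }
        $pending{$pid} = $cmd;
    }

    my $failures = 0;
    while (%pending) {
        my $pid = wait();
        if ($pid == -1) {                 # no children left to reap
            $failures += keys %pending;   # count unaccounted children as failed
            last;
        }
        next unless delete $pending{$pid};    # skip children we did not start
        $failures++ if $? != 0;           # non-zero exit status or killed by signal
    }
    return $failures;
}

# Usage with the archive names from the question:
# my $failed = run_all(map { ['tar', '-xzf', $_] } 'abc.gz', 'efg.gz', '123.gz', 'xyz.gz');
# die "$failed extraction(s) failed\n" if $failed;
# system('tar', '-xzf', 'parent.gz') == 0 or die "extracting parent.gz failed\n";
```

Tracking the pids in a hash keeps the simplicity of the wait loop while ignoring any child that was forked implicitly elsewhere.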

Upvotes: 3
