dwillis77

Reputation: 443

Perl Behavioral Differences Closing Child Process Spawned with open() vs. IPC::Open3

I'm trying to figure this out but haven't been able to wrap my head around it. I need to open a piped subprocess and read from its output. Originally I was using the standard open() call like this:

#!/usr/bin/perl

use warnings;
use strict;
use Scalar::Util qw(openhandle);
use IPC::Open3;

my $fname = "/var/log/file.log.1.gz";
my $pid = open(my $fh, "-|:encoding(UTF-8)", "gunzip -c \"$fname\" | tac");

# Read one line from the file
while (my $row = <$fh>) {
    print "Row: $row\n";
    last; # Bail out early
}

# Check if the PID is valid and kill it if so
if (kill(0, $pid) == 1) {
    kill(15, $pid);
    waitpid($pid, 0);
    $pid = 0;
}

# Close the filehandle if it is still open
if (openhandle($fh)) {
    close $fh;
}

The above works, except that I get errors from tac in the logs saying:

tac: write error

From my testing and research, this seems to happen because killing the PID returned from open() kills only the first child process, not the second. When I then close the filehandle, tac is still writing to it, hence the "write error" from the broken pipe. Oddly, when close() returns false and I check ($? >> 8), it is sometimes 141, which indicates a SIGPIPE and supports my theory, but at other times it is 0.
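For reference, a minimal sketch of how the status in $? can be unpacked after close(). The helper name describe_wait_status is hypothetical, and treating an exit code above 128 as "128 + signal number" is only the usual shell convention (141 = 128 + 13, with SIGPIPE being 13 on Linux), not something Perl guarantees:

```perl
use strict;
use warnings;

# Hypothetical helper: turn a wait status (the value of $?) into a description.
sub describe_wait_status {
    my ($status) = @_;
    my $signal = $status & 127;   # low 7 bits: signal that killed the direct child
    my $exit   = $status >> 8;    # high byte: exit code of the direct child

    return "killed by signal $signal" if $signal;
    # Assumption: a shell reports a pipeline member killed by signal N as 128 + N.
    return "exit code $exit (shell-style report of signal " . ($exit - 128) . ")"
        if $exit > 128;
    return "exit code $exit";
}

print describe_wait_status(141 << 8), "\n";   # what the shell hands back after SIGPIPE
print describe_wait_status(13), "\n";         # direct child killed by signal 13
```

With a shell in between, the direct child is the shell, so a signal delivered to tac or gunzip shows up in the exit code rather than in the low bits.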

Furthermore, if I run the same command but without a double-pipe (only a single one), like this (everything else the same as above):

my $pid = open(my $fh, "-|:encoding(UTF-8)", "gunzip -c \"$fname\"");

...I'll get an error in the logs like this:

gzip: stdout: Broken pipe

...but in this case, gunzip/gzip was the only process (which I killed via the returned PID), so I'm not sure why it would still be writing to the pipe when I close the filehandle (since it was supposed to be killed already, AND waited for with waitpid()).

I'm trying to reproduce this in the Perl debugger, but it's difficult because I can't get the stderr of the child process with plain open(). (In production I see the external process's stderr in the apache2 logs, since this is a CGI script.)

I understand from reading the docs that I can't get the PID of all child processes in a multi-piped open with open(), so I decided to try and resort to a different method so that I could close all processes cleanly. I tried open3(), and interestingly, without making any changes (literally running basically the same exact scenario as above but with open3() instead of open()):

my $pid = open3(my $in, my $fh, undef, "gunzip -c \"$fname\"");

...and then killing it just like I did above, I don't get any errors. This holds true for both the single piped process as shown above, as well as the double-piped process that involves piping to "tac".

So what am I missing here? I know there are differences in the way open() and open3() work, but are there differences in the way they spawn child processes? In both cases I can see that the initial child (the PID returned) is itself a child of the Perl process. But it's almost as if the process spawned by open() is not getting properly killed and/or cleaned up (via waitpid()) while the same process spawned by open3() is, and that's the part I can't figure out.

And, more to the bigger picture and the issue at hand - what is the suggestion for the best way to cleanly close a multi-piped process in this sort of scenario? Am I spending more time than is warranted on this? The script itself works as it should aside from these errors, so if it turns out that the tac and gzip errors I'm seeing are inconsequential, should I just live with them and move on?

Any help is much appreciated!

Upvotes: 2

Views: 322

Answers (2)

user10678532

This happens because either your perl script or its parent is ignoring the SIGPIPE signal, and an ignored signal disposition is inherited by child processes.

Here is a simpler testcase for your condition:

$ perl -e '$SIG{PIPE}="IGNORE"; open my $fh, "-|", "seq 100000 | tac; true"; print scalar <$fh>'
100000
tac: write error
$ (trap "" PIPE; perl -e 'open my $fh, "-|", "seq 100000 | tac"; print scalar <$fh>')
100000
tac: write error
$ (trap "" PIPE; perl -e 'my $pid = open my $fh, "-|", "seq 100000 | tac"; print scalar <$fh>; kill 15, $pid; waitpid $pid, 0')
100000
$ tac: write error

The last version does the same kill as the code in the question. That kill hits neither the left nor the right side of the pipeline, only the shell that runs and waits for both. (Some shells will exec through the left side of a pipeline; with such shells, a ; exit $? could be appended to the command in order to reproduce the example.)
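If the goal is to terminate every member of the pipeline rather than just the shell, one option is to run the shell in its own process group and signal the whole group. This is only a sketch, not something from the question: the helper name first_line_from_pipeline is hypothetical, it uses fork/exec instead of open(), and it assumes a POSIX system with /bin/sh:

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: run $cmd via /bin/sh in a new process group, read the
# first line of its output, then send SIGTERM to the whole group.
sub first_line_from_pipeline {
    my ($cmd) = @_;

    pipe(my $reader, my $writer) or die "pipe failed: $!";
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: lead a new process group so the pipeline can be signalled as a unit.
        close $reader;
        POSIX::setpgid(0, 0);
        open(STDOUT, '>&', $writer) or POSIX::_exit(127);
        exec('/bin/sh', '-c', $cmd) or POSIX::_exit(127);
    }

    close $writer;
    POSIX::setpgid($pid, $pid);   # also set from the parent; failure here is harmless
    my $line = <$reader>;

    kill('-TERM', $pid);          # negative signal name: signal the process group
    close $reader;
    waitpid($pid, 0);             # reap the shell
    return $line;
}

print "Row: ", first_line_from_pipeline('seq 100000 | tac') // '', "\n";
```

Because the processes are terminated before the read end is closed, they never get the chance to hit EPIPE, regardless of how SIGPIPE is set up.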

One case where SIGPIPE is ignored on entry to a perl script is when it runs via FastCGI, which sets the SIGPIPE disposition to ignore and expects the script to handle it. There, simply installing a SIGPIPE handler instead of IGNORE (even an empty handler) would work, because a handled signal's disposition is reset to default when an external command is executed:

$SIG{PIPE} = sub { };
open my $fh, '-|', 'trap - PIPE; ... | tac';
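The difference between an ignored and a handled SIGPIPE can be checked with a small experiment. This is a sketch under assumptions: the helper name child_killed_by_sigpipe is hypothetical, and a second perl process (started through $^X) stands in for gunzip or tac so that the test does not depend on external tools:

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: start a chatty child with the given SIGPIPE disposition
# in the parent, read one line, close the pipe, and report whether the child
# was killed by SIGPIPE.
sub child_killed_by_sigpipe {
    my ($disposition) = @_;
    local $SIG{PIPE} = $disposition;   # 'IGNORE', 'DEFAULT', or a code ref

    # The child writes far more than a pipe buffer holds; if a print fails
    # (EPIPE with SIGPIPE ignored) it exits with status 3 instead.
    my @child = ($^X, '-e', 'for (1 .. 2_000_000) { print "line $_\n" or exit 3 }');
    open(my $fh, '-|', @child) or die "Cannot start child: $!";

    my $first = <$fh>;
    close $fh;                         # waits for the child and sets $?
    return (($? & 127) == POSIX::SIGPIPE()) ? 1 : 0;
}

printf "IGNORE inherited: killed by SIGPIPE? %s\n",
    child_killed_by_sigpipe('IGNORE') ? 'yes' : 'no';
printf "empty handler:    killed by SIGPIPE? %s\n",
    child_killed_by_sigpipe(sub { }) ? 'yes' : 'no';
```

On a typical Unix system the first case should report "no" (the child sees a write error, which is what produces messages like "tac: write error") and the second "yes" (the child dies quietly from the signal).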

When run as a standalone script, it could be some setup bug (I've seen it happen in questions related to containerization on Linux), or someone trying to exploit buggy programs that run with elevated privileges and don't bother to handle write(2) errors (EPIPE in this case).

my $pid = open3(my $in, my $fh, undef, "gunzip -c \"$fname\"");

...and then killing it just like I did above, I don't get any errors.

Where would you expect the errors to show up? With undef as the third argument, the child's stderr is redirected to the same $fh from which you only read the first line.

The behavior is no different with open3 once tac's stderr is sent somewhere visible:

$ (trap "" PIPE; perl -MIPC::Open3 -e 'my $pid = open3 my $in, my $out, my $err, "seq 100000 | tac 2>/dev/tty"; print scalar <$out>')
100000
$ tac: write error
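The point about stderr sharing the handle can be seen in isolation. In this sketch the helper name capture_with_open3 and the echo command are made up for illustration. IPC::Open3 documents that a false CHLD_ERR argument puts the child's stderr on the same handle as its stdout, and that a separate stderr handle needs a real glob such as one from Symbol::gensym:

```perl
use strict;
use warnings;
use IPC::Open3;
use Symbol qw(gensym);

# Hypothetical helper: run a tiny command and return (stdout text, stderr text).
sub capture_with_open3 {
    my ($separate_stderr) = @_;

    # undef: stderr is merged into the stdout handle; gensym: separate handle.
    my $err = $separate_stderr ? gensym() : undef;
    my $pid = open3(my $in, my $out, $err,
                    'sh', '-c', 'echo to-stdout; echo to-stderr >&2');
    close $in;

    # Reading one handle to EOF before the other is only safe for small outputs.
    my @out_lines = <$out>;
    my @err_lines;
    @err_lines = <$err> if $separate_stderr;

    waitpid($pid, 0);
    return (join('', @out_lines), join('', @err_lines));
}

my ($merged_out)         = capture_with_open3(0);
my ($split_out, $split_err) = capture_with_open3(1);
print "merged handle:\n$merged_out";
print "separate stdout:\n$split_out";
print "separate stderr:\n$split_err";
```

In the merged case the error text is simply waiting, unread, behind the first line of output, so nothing reaches the logs.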

Upvotes: 2

Shawn

Reputation: 52439

If you just want to read the last line of a gzipped file, it's easy to do it in pure perl without calling an external program:

#!/usr/bin/env perl
use warnings;
use strict;
use feature qw/say/;
use IO::Uncompress::Gunzip qw/$GunzipError/;

my $fname = 'foo.txt.gz';
my $z = IO::Uncompress::Gunzip->new($fname) or die "Couldn't open file: $GunzipError\n";
my $row;
while (<$z>) {
  $row = $_;
}
say "Row: $row";

Upvotes: 3
