Reputation: 13116
There is a system()
call in a Perl script with multiple pipes, using a single scalar argument. The call looks more or less like this:
system("zcat /foo.gz | grep '^.{6}X|Y|Z' | awk '{print $2,$3,$4,$6}' | bzip2 > /foo.processed.bz2");
The file in question (foo.gz
) is quite large, about 2GB compressed in size. I guess that's why it was originally done via a system call.
Questions:
The problem now is, that this system call always seem to return 0, whether one of the system commands fail or not. I assume this is because it gets invoked via sh -c '...'
. Is that correct?
Is there a way to check if a system()
call was successful if only a single scalar argument is passed?
Is there a better way to process a large file like this, in a way thats equally or more efficient (in terms of speed mainly)?
Thanks for any hints as I am not really familiar with Perl.
Upvotes: 2
Views: 1059
Reputation: 107040
Two things:
bzip2
command.You're always better off using Perl modules for things like this. In this case, I bet the Perl modules will be even faster than the shell pipeline, and you'll have more control over the entire operation.
There's a set called IO::Compress that can handle both Zip and BZip2.
I use Archive::Zip which is a great module, but you want to use the Bzip2 compression algorithm, and Archive::Zip
can't handle that.
Upvotes: 3
Reputation: 13116
Based on your comments and answers, I'd do it like that now:
$infile =~ s/(.*\.gz)\s*$/gzip -dc < $1|/;
open(OUTFH, "| /bin/bzip > $outfile") or die "Can't open $outfile: $!";
open(INFH, $infile) or die "Can't open $infile: $!";
while (my $line = <INFH>) {
if ($line =~ /^.{6}X|Y|Z) {
# TODO: the awk part...
print OUTFH $line;
}
}
close(INFH);
close(OUTFH);
Please feel free to comment and vote up/down.
Upvotes: 1
Reputation: 43508
system()
returns what the /bin/sh
shell returns. When multiple commands are pipelined, the shell forks a new process for each of them and the status code of the last command in the chain is returned, in this case bzip2
.
Upvotes: 2
Reputation: 83
You'd be better doing the text processing from within perl itself - that's what perl's for :)
system() only ever returns 0 or 1. To capture actual output, try calling it via backticks: `command` rather than system('command')
Upvotes: -1