Reputation: 35321
I want to extract some data from a large-ish (3+ GB, gzipped) FTP download, and do this on the fly, to avoid dumping the full download on my disk.
To extract the desired data I need to examine the uncompressed stream line-by-line.
So I'm looking for the moral equivalent of
use PerlIO::gzip;
my $handle = open '<:gzip', 'ftp://ftp.foobar.com/path/to/blotto.txt.gz'
    or die $!;
while (my $line = <$handle>) {  # line-by-line; a for loop would slurp the whole stream
    # etc.
}
close($handle);
FWIW: I know how to open a read handle to ftp://ftp.foobar.com/path/to/blotto.txt.gz (with Net::FTP::retr), but I have not yet figured out how to add a :gzip layer to this open handle.
It took me a lot longer than it should have to find the answer to the question above, so I thought I'd post it for the next person who needs it.
Upvotes: 0
Views: 122
Reputation: 3705
The code below is from the IO::Compress FAQ:
use Net::FTP;
use IO::Uncompress::Gunzip qw(:all);
my $ftp = new Net::FTP ...
my $retr_fh = $ftp->retr($compressed_filename);
gunzip $retr_fh => $outFilename, AutoClose => 1
    or die "Cannot uncompress '$compressed_filename': $GunzipError\n";
To get the data line by line, change it to this:
use Net::FTP;
use IO::Uncompress::Gunzip qw(:all);
my $ftp = new Net::FTP ...
my $retr_fh = $ftp->retr($compressed_filename);
my $gunzip = new IO::Uncompress::Gunzip $retr_fh, AutoClose => 1
    or die "Cannot uncompress '$compressed_filename': $GunzipError\n";
while (<$gunzip>) {
    ...
}
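The same pattern can be tried without a live FTP server: IO::Uncompress::Gunzip will wrap any open filehandle, so the sketch below (my illustration, not from the FAQ) gzips a small string into an in-memory buffer and reads it back line by line, exactly as the code above reads from the FTP data connection.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw($GunzipError);

# Build a small gzipped buffer in memory to stand in for the
# FTP data connection (any readable filehandle works the same way).
my $compressed;
gzip \"line one\nline two\n" => \$compressed
    or die "gzip failed: $GzipError\n";

# Open an in-memory filehandle on the compressed buffer.
open my $fh, '<', \$compressed or die $!;

# Wrap the filehandle; <$gunzip> then yields uncompressed lines.
my $gunzip = IO::Uncompress::Gunzip->new($fh, AutoClose => 1)
    or die "Cannot uncompress: $GunzipError\n";

while (my $line = <$gunzip>) {
    print $line;   # process one uncompressed line at a time
}
close($gunzip);
```

Swapping the in-memory `$fh` for the handle returned by `$ftp->retr(...)` gives the streaming behaviour asked for, with no temporary file on disk.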
Upvotes: 1
Reputation: 35321
OK, the answer is (IMO) not at all obvious: binmode($handle, ':gzip').
Here's a fleshed-out example:
use strict;
use warnings;
use Net::FTP;
use PerlIO::gzip;

my $ftp = Net::FTP->new('ftp.foobar.com') or die $@;
$ftp->login or die $ftp->message;  # anonymous FTP
my $handle = $ftp->retr('/path/to/blotto.txt.gz') or die $ftp->message;
binmode($handle, ':gzip');
while (my $line = <$handle>) {  # while, not for: a for loop would slurp the whole stream
    # etc.
}
close($handle);
Upvotes: 1