Reputation: 1821
I have a very large (10 GB) file consisting of a single line (basically one INSERT statement), which I can't load into memory. I want to process that line with some regexes and extract the meaningful values.
The values are in tuples (the data is between parentheses, i.e. (.*) ).
So I want to read one tuple at a time from the file and process it.
What I am thinking of doing is using getc like this:
getc FILEHANDLE
That way I read one character at a time and check whether I have reached the end of a tuple (in my case the ending sequence is ),).
Is there a better, more efficient way to do this?
Thanks.
Upvotes: 1
Views: 675
Reputation: 206
You could set Perl's special input record separator variable, $/ ($INPUT_RECORD_SEPARATOR under use English), to your tuple-ending character. Note that $/ is treated as a literal string, not as a regex.
#!/usr/bin/env perl
use strict;
use warnings;
use feature qw/ say /;
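open( my $fh, '<', 'foo.txt' ) or die "Cannot open foo.txt: $!";
# $/ is a plain string; use '),' instead if tuples are separated by "),".
my $tuple_ending_char = ')';
local $/ = $tuple_ending_char;
# Each read returns one record ending in ')', so memory use stays small.
while ( my $record = <$fh> ) {
    say $record;
}
close($fh);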
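Building on the same idea, here is a minimal sketch of how each record could be turned into field values with a regex. The helper name for_each_tuple, the '),' separator, and the sample INSERT statement are assumptions made for illustration, and the naive split assumes no commas or parentheses inside quoted values.

```perl
use strict;
use warnings;

# Hypothetical helper: call $callback once per tuple with the tuple's fields.
sub for_each_tuple {
    my ( $fh, $callback ) = @_;
    local $/ = '),';    # assumption: tuples are separated by "),"
    my $count = 0;
    while ( my $record = <$fh> ) {
        # Take the last "(...)" group in the record; this skips a leading
        # "INSERT INTO ... VALUES" prefix on the first record.
        next unless $record =~ /\(([^()]*)\)[^()]*\z/;
        # Naive split: assumes no commas inside quoted values.
        my @fields = split /\s*,\s*/, $1;
        $callback->(@fields);
        $count++;
    }
    return $count;
}

# Demonstration on an in-memory string instead of the real 10 GB file.
my $sample = "INSERT INTO t VALUES (1,'a'),(2,'b'),(3,'c');";
open( my $fh, '<', \$sample ) or die "Cannot open in-memory file: $!";
my $seen = for_each_tuple( $fh, sub { print join( ' | ', @_ ), "\n" } );
close($fh);
print "tuples processed: $seen\n";
```

Passing a callback rather than returning a list keeps only one tuple in memory at a time, which matters for a file of this size.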
Upvotes: 3
Reputation: 281
You can also try the following code, which reads the file in fixed-size chunks, though it is not as elegant as davewood's solution.
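use strict;
use warnings;
use Data::Dumper;

my $filename    = '/tmp/sample.txt';
my $buffer_size = 1024;

open( my $fh, '<', $filename ) or die "Cannot open $filename: $!";
my $answer = "";
while (1) {
    # Append the next chunk to whatever was left over from the previous pass;
    # without the offset argument, read() would overwrite $answer.
    my $bytes_read = read( $fh, $answer, $buffer_size, length($answer) );
    die "Read error on $filename: $!" unless defined $bytes_read;

    # Remove every complete tuple (one followed by a comma) from the buffer.
    my @tuples;
    while ( $answer =~ s/^[^(]*(\(.*?\))\s*,\s*//s ) {
        push @tuples, $1;
    }
    print Dumper( \@tuples ) if @tuples;

    last if $bytes_read == 0;    # end of file
}
# The final tuple has no trailing comma, so pick it up from the leftover.
print Dumper( [$1] ) if $answer =~ /(\(.*?\))/s;
close($fh);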
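The chunked approach can also be combined with $/: setting $/ to a reference to an integer makes the readline operator return fixed-size records instead of lines. The sketch below is one possible way to do that, not a definitive implementation; the helper name scan_tuples_chunked, the chunk size, and the sample data are assumptions, and it assumes tuples contain no nested or quoted parentheses.

```perl
use strict;
use warnings;

# Hypothetical helper: read $fh in fixed-size chunks and call $callback
# with each complete "(...)" tuple.
sub scan_tuples_chunked {
    my ( $fh, $chunk_size, $callback ) = @_;
    local $/ = \$chunk_size;    # reference to an integer: fixed-size records
    my $buffer = '';
    my $count  = 0;
    while ( defined( my $chunk = <$fh> ) ) {
        $buffer .= $chunk;
        # Remove and report every tuple that is already complete; a tuple
        # split across two chunks stays in $buffer until its ')' arrives.
        while ( $buffer =~ s/^[^(]*(\([^()]*\))// ) {
            $callback->($1);
            $count++;
        }
    }
    return $count;
}

# Demonstration on an in-memory string with a deliberately tiny chunk size.
my $sample = "INSERT INTO t VALUES (1,'a'),(2,'b'),(3,'c');";
open( my $fh, '<', \$sample ) or die "Cannot open in-memory file: $!";
my $seen = scan_tuples_chunked( $fh, 8, sub { print "$_[0]\n" } );
close($fh);
print "tuples found: $seen\n";
```

For a real file a much larger chunk size (for example 64 KB or more) would likely reduce the number of read calls.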
Upvotes: 2