kailash19

Reputation: 1821

Perl: Read from file till specified character(s) found

I have a very large (10 GB) single-line file (basically one INSERT statement) which I can't load into memory. I want to process that line with some regexes and extract the meaningful values.

The values are in tuples (the data sits between parentheses: (.*) ).

So I want to read one tuple at a time from the file and process it.

What I am thinking of doing is using getc like this:

getc FILEHANDLE

So I read each character and check whether it completes my tuple-ending sequence (in my case it is ), ).
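
A minimal sketch of that getc loop, assuming a tuple ends at a `)` immediately followed by `,`. The sub name `for_each_tuple_getc` and the callback argument are hypothetical, for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper: reads one character at a time from $fh and calls
# $callback with each tuple, i.e. the text up to a ')' followed by ','.
sub for_each_tuple_getc {
    my ($fh, $callback) = @_;
    my ($buf, $prev) = ('', '');
    while (defined(my $ch = getc($fh))) {
        if ($prev eq ')' && $ch eq ',') {
            $callback->($buf);    # $buf ends with ')'; the ',' is dropped
            $buf = '';
        }
        else {
            $buf .= $ch;
        }
        $prev = $ch;
    }
    # Whatever follows the last separator (usually the final tuple)
    $callback->($buf) if length $buf;
}
```

Only one tuple is held in memory at a time, but every character costs a function call.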

Is there a more efficient way to do this?

Thanks.

Upvotes: 1

Views: 675

Answers (2)

davewood

Reputation: 206

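You could set the special Perl variable $/ (the input record separator) to your tuple-ending character. Each read from the filehandle then returns one record ending in that character instead of one line.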

#!/usr/bin/env perl
use strict;
use warnings;
use feature qw/ say /;

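open( my $fh, '<', 'foo.txt' ) or die "Cannot open foo.txt: $!";
my $tuple_ending_char = ')';
local $/ = $tuple_ending_char;    # read up to and including each ')'

while (<$fh>) {
    say $_;    # $_ holds one record, including the trailing ')'
}
close($fh);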
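
Since $/ is a plain string rather than a regex, it can also hold more than one character. The sketch below is one way that might look, assuming the tuples are separated by exactly `),` with no whitespace in between. The sub name `for_each_tuple` and the callback are hypothetical names, not part of the code above. chomp strips the trailing separator, so the closing parenthesis is added back afterwards.

```perl
use strict;
use warnings;

# Hypothetical helper: calls $callback once per record read from $fh,
# treating the two-character string '),' as the record separator.
sub for_each_tuple {
    my ($fh, $callback) = @_;
    local $/ = '),';    # a literal string, not a regex
    while (my $record = <$fh>) {
        # chomp removes a trailing $/ and returns the number of characters
        # removed; put back the ')' that belonged to the tuple.
        $record .= ')' if chomp($record);
        $callback->($record);
    }
}

# Usage: an in-memory filehandle stands in for the real file here.
my $data = 'INSERT INTO t VALUES (1,2),(3,4),(5,6);';
open(my $in, '<', \$data) or die "Cannot open in-memory file: $!";
for_each_tuple($in, sub { print "$_[0]\n" });
close($in);
```

The first record still carries the text before the first tuple and the last one carries anything after the final tuple, so those two may need extra trimming.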

Upvotes: 3

vinod

Reputation: 281

You can also try the following code, which reads the file in fixed-size chunks, although it is not as elegant as davewood's solution.

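use strict;
use warnings;
use Data::Dumper;

my $filename    = '/tmp/sample.txt';
my $buffer_size = 1024;

open(my $fh, '<', $filename) or die "Cannot open $filename: $!";
my $answer = "";
while (1) {
    # Append to the buffer (the 4th argument is the offset) so that a
    # partial tuple left over from the previous chunk is not overwritten.
    my $bytes_read = read($fh, $answer, $buffer_size, length($answer));
    die "Read error on $filename: $!" unless defined $bytes_read;
    my $at_eof = ($bytes_read == 0);

    # A complete tuple is followed by a comma; at end of file also accept
    # the last tuple, which has no trailing comma.
    my @tuples = $at_eof
        ? ($answer =~ /\(.*?\)/g)
        : ($answer =~ /\(.*?\),\s*/g);
    print Dumper(\@tuples) if @tuples;
    last if $at_eof;

    # Keep only the unfinished tail after the last complete tuple.
    $answer =~ s/.*\),\s*//s;
}
close($fh);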

Upvotes: 2
