Madhan

Reputation: 1321

How to read a text file quickly using Perl?

My CGI page takes a long time to read and manipulate text from a text file.
I have stored thousands of records in a text file in the format below.

|!| Row 1 |!| Row 2 |!| Row 3 |!| Row 4 |!| Row 5 |!| Row 6 |!| Row 7
|!| Row 1 |!| Row 2 |!| Row 3 |!| Row 4 |!| Row 5 |!| Row 6 |!| Row 7
|!| Row 1 |!| Row 2 |!| Row 3 |!| Row 4 |!| Row 5 |!| Row 6 |!| Row 7

I am displaying the above text data in a CGI page by splitting on the " |!| " separator with the help of split. The code I am using is below.

use strict;
use CGI;
use File::Slurp;

my $htmls = CGI->new();

my ($recordfile, @content, $tablefields);
$recordfile   = 'Call.txt';
@content      = read_file($recordfile);
$tablefields  = validate_records(\@content);

sub validate_records {
    my @all_con = @{(shift)};
    my $tab_str;
    my $cnts    = 0;
    foreach my $rec_ln (@all_con) {
        $cnts++;
        chomp($rec_ln);
        push my @splitted, split(/ \|\!\| /, $rec_ln);

        my $radioStr = "<input type=\"radio\" name=\"cell\" value=\"$rec_ln\"\/>";

        $tab_str.="<tr>
       <td style=\"text-align\:center\;\">$radioStr</td>                               
       <td>$splitted[1]</td>
       <td>$splitted[2]</td>
       <td>$splitted[3]</td>
       <td>$splitted[4]</td>
       <td>$splitted[5]</td>
       <td>$splitted[6]</td>
       <td>$splitted[7]</td>
   </tr>";

        $tab_str=~s/<td><\/td>/<td>N\/A<\/td>/igs;
   }
   return $tab_str;    
}

print
$htmls->header(),
'<html>
   <head></head>
   <body>
            <table border="1" align="center" width="100%" id="table" style="margin-top:35px;border:0px;" class="TabClass"><thead>
              <tr>
                <th>SELECT</th>                 
                <th>HEADER 2</th>
                <th>HEADER 3</th>
                <th>HEADER 4</th>
                <th>HEADER 5</th>
            <th>HEADER 6</th>
                <th>HEADER 7</th>
              </tr>
           </thead>'.
              $tablefields.
            '</table>

   </body>
   </html>'; 

The above code takes more than two minutes to display all the data on my page whenever the file contains many records. Is there any way to read and manipulate the file records more quickly?

Please share your suggestions.

Upvotes: 0

Views: 150

Answers (2)

Toto

Reputation: 91373

First, move the line

$tab_str =~ s/<td><\/td>/<td>N\/A<\/td>/igs;

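out of the foreach loop, so that the substitution runs once over the finished string instead of rescanning the whole growing string on every iteration.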
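As a rough sketch of that change (not the asker's exact code): the helper name build_rows is made up for illustration, the radio button column is left out for brevity, and the splitting assumes every record starts with the |!| delimiter, as in the sample data. The only point is that the substitution sits after the loop.

```perl
use strict;
use warnings;

# Hypothetical helper: build every <tr> first, then fill empty cells once.
sub build_rows {
    my ($lines) = @_;                    # array reference, so the data is not copied
    my $tab_str = '';
    for my $line (@$lines) {
        my $rec = $line;
        chomp $rec;
        $rec =~ s/^\s*\|!\|\s*//;        # assumption: each record starts with the delimiter
        my @fields = split /\s*\|!\|\s*/, $rec, -1;   # -1 keeps trailing empty fields
        $tab_str .= '<tr>' . join('', map { "<td>$_</td>" } @fields) . "</tr>\n";
    }
    # One pass over the finished string instead of one pass per record.
    $tab_str =~ s{<td></td>}{<td>N/A</td>}g;
    return $tab_str;
}

print build_rows([ "|!| a |!|  |!| c\n", "|!| d |!| e |!| f\n" ]);
```

With the substitution inside the loop, each record triggers a scan of everything built so far, so the work grows roughly with the square of the number of records; outside the loop it is a single scan.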

Upvotes: 2

TLP

Reputation: 67900

Why does your program take a long time to run? Let's check what your program does:

First you slurp the contents of the file into @content. Then you copy the values over to @all_con inside the subroutine. You have now in rapid succession used up twice your file size worth of memory, which will not be returned until the end of your program.

Now you loop over the file lines and split them, perform some concatenations, and end up with a string that is more than twice as long as the original line. Then you string all those lines together, and for each new addition you perform a substitution on the entire growing string to check for empty cells. You now have 4 times your original file size in memory, and you are performing a regex substitution on it.

What you should do is:

Remove the delimiter |!|, and use a proper serialisation module, such as Text::CSV. Pass the file name to the subroutine, and parse the file using a while loop:

use Text::CSV;

my $csv = Text::CSV->new({ binary => 1 });   # using comma delimiter
open my $fh, "<", $file or die "Cannot open $file: $!";
while (my $row = $csv->getline($fh)) {
    print .... ;                             # print directly
}
close $fh;

The Text::CSV module is very efficient and the CSV format is reliable. Because you iterate over the file handle and print directly, you do not store data in memory unnecessarily.

Also, instead of using a substitution to check for empty fields, you can do that directly when concatenating your string:

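# start_table() and td() come from CGI's function interface,
# e.g. use CGI qw(:standard *table);
print start_table(), "<tr>";
for (@$row) {
    my $val = $_;
    if ($val !~ /\S/) {   # contains no non-whitespace
        $val = "N/A";
    }
    print "\t", td($val);
}
print "</tr>";            # close the row opened above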
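If changing the file format is not an option, the same two ideas (read one line at a time, fill empty cells while building the row) can be sketched with core Perl only and the existing |!| delimiter. This is a variation on the advice above, not the Text::CSV approach itself. The helper name row_html is made up, the sample data lives in a string so the sketch runs without Call.txt, and it assumes each record starts with the delimiter.

```perl
use strict;
use warnings;

# Hypothetical helper: turn one record into a <tr>, filling blank cells as it goes.
sub row_html {
    my ($line) = @_;
    chomp $line;
    $line =~ s/^\s*\|!\|\s*//;           # assumption: each record starts with the delimiter
    my @cells = map { $_ =~ /\S/ ? $_ : 'N/A' }
                split /\s*\|!\|\s*/, $line, -1;   # -1 keeps trailing empty fields
    return '<tr>' . join('', map { "<td>$_</td>" } @cells) . "</tr>\n";
}

# Demo input held in a string; in the CGI script, open the real file here instead.
my $sample = "|!| Row 1 |!|  |!| Row 3\n|!| Row 1 |!| Row 2 |!| Row 3\n";
open my $fh, '<', \$sample or die "Cannot open sample data: $!";

print "<table>\n";
while (my $line = <$fh>) {               # only one record is held in memory at a time
    print row_html($line);               # printed immediately, no accumulated string
}
print "</table>\n";
close $fh;
```

Field values are inserted as-is here; if the records can contain characters such as < or &, they would need HTML escaping before printing.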

Upvotes: 1
