Vicks

Reputation: 13

HTML::TableExtract - passing an expression for headers

I have a subroutine, called from another script, that reads an HTML file. The code is below.

sub read_html
{
    $data=`cat "$_[0]"`;
    use HTML::TableExtract;
    print "CALLING read_html to read $_[0]\n";
    #my $self = shift;
    print "$_[1]";
    $te = HTML::TableExtract->new( headers => [($_[1])] );
    $te->parse($data);
    my $line_cnt=0;
    # Examine all matching tables
    foreach $ts ($te->tables)
    {
        if ($ts->rows ne "")
        {
            foreach $row ($ts->rows)
            {
                foreach (@$row) { $_='' unless defined $_; }
                print @$row;
                if (@$row[0] ne ' '  and @$row[0] ne ''  and
                    @$row[0] ne "\n" and @$row[0] ne "\t")
                {
                    $line_cnt++;
                }
            }
        }
        return $line_cnt;
    }
}

When I run the above script, it doesn’t show me the HTML table data when the headers are passed in as a variable:

$te = HTML::TableExtract->new( headers => [($_[1])] );

However, if I replace the expression $_[1] with hard-coded values as below, it returns all the column values under the specified headers:

$te = HTML::TableExtract->new(
    headers => [("PO Number",
                 "Invoice Number",
                 "DC Number",
                 "Store Number",
                 "Invoice Amount",
                 "Discount",
                 "Amount Paid")] );

I am calling the subroutine as read_html($file, $headers) where $file is a file name and $headers holds the header values, comma separated.

Any help would be greatly appreciated.

Upvotes: 0

Views: 705

Answers (1)

Oktalist

Reputation: 14714

"I am calling the subroutine as read_html($file, $headers) where $file is a file name and $headers as the header values comma separated."

The headers parameter of HTML::TableExtract->new expects a reference to an array of strings, one string per header. It sounds like you are instead passing it a reference to an array that holds a single string with the commas still in it, which will not match any individual column header. Split the string into separate headers first:

my @headers = split m(\s*,\s*), $_[1];
$te = HTML::TableExtract->new( headers => \@headers );

If this is not the problem, then your question needs to be more specific about how you are calling read_html.
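To see the difference between the two forms without needing an HTML file, here is a minimal sketch. The helper name parse_headers and the sample string are assumptions made up for illustration; they are not part of HTML::TableExtract. It also trims whitespace at the ends of the string, which the split alone would leave in place.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of HTML::TableExtract): turn a
# comma-separated string into a reference to an array of header names.
sub parse_headers {
    my ($header_string) = @_;
    # Trim the ends of the whole string first; the split below only
    # removes the whitespace that surrounds each comma.
    $header_string =~ s/^\s+|\s+$//g;
    my @headers = split /\s*,\s*/, $header_string;
    return \@headers;
}

my $header_string = "PO Number, Invoice Number , DC Number";

# What the question's code builds: one element holding the whole string.
my @unsplit = ($header_string);
print "unsplit: ", scalar(@unsplit), " element\n";

# What the headers parameter needs: one element per header.
my $headers = parse_headers($header_string);
print "split: ", scalar(@$headers), " elements\n";
print "[$_]\n" for @$headers;
```

The resulting reference can then be passed straight through, as in HTML::TableExtract->new( headers => $headers ).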

Upvotes: 1
