Reputation: 13
I have a subroutine that is called through another script to read the HTML file. Below is the code.
sub read_html
{
    $data=`cat "$_[0]"`;
    use HTML::TableExtract;
    print "CALLING read_html to read $_[0]\n";
    #my $self = shift;
    print "$_[1]";
    $te = HTML::TableExtract->new( headers => [($_[1])] );
    $te->parse($data);
    my $line_cnt=0;
    # Examine all matching tables
    foreach $ts ($te->tables)
    {
        if ($ts->rows ne "")
        {
            foreach $row ($ts->rows)
            {
                foreach (@$row) { $_='' unless defined $_; }
                print @$row;
                if (@$row[0] ne ' ' and @$row[0] ne '' and
                    @$row[0] ne "\n" and @$row[0] ne "\t")
                {
                    $line_cnt++;
                }
            }
        }
        return $line_cnt;
    }
}
When I run the above script, it doesn't print any of the HTML table data when the headers are passed in as a variable:
$te = HTML::TableExtract->new( headers => [($_[1])] );
However, if I replace the expression $_[1] with hard-coded values like below, it returns all the column values under the specified headers:
$te = HTML::TableExtract->new(
    headers => [("PO Number",
                 "Invoice Number",
                 "DC Number",
                 "Store Number",
                 "Invoice Amount",
                 "Discount",
                 "Amount Paid")] );
I am calling the subroutine as read_html($file, $headers), where $file is a file name and $headers holds the header values, comma separated.
Any help would be greatly appreciated.
Upvotes: 0
Views: 705
Reputation: 14714
I am calling the subroutine as read_html($file, $headers) where $file is a file name and $headers as the header values comma separated.
The headers parameter of HTML::TableExtract->new expects a reference to an array of strings, where each string is a separate header. It sounds like you are instead passing it a reference to an array holding a single string that contains commas. Split that string into separate headers first:
my @headers = split m(\s*,\s*), $_[1];
$te = HTML::TableExtract->new( headers => \@headers );
If this is not correct, then your question needs to be more specific about how you are calling read_html.
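To see the difference between the two array shapes, here is a small sketch that uses only core Perl. The helper name parse_header_list and the sample string are made up for illustration and are not part of HTML::TableExtract. It also trims surrounding whitespace and drops empty entries, which the one-line split above does not do.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of HTML::TableExtract): turn a
# comma-separated string into a list of trimmed header names.
sub parse_header_list {
    my ($csv) = @_;
    return () unless defined $csv;
    $csv =~ s/^\s+|\s+$//g;                   # trim the whole string
    return grep { length } split /\s*,\s*/, $csv;
}

# Sample input, assumed to look like what the caller passes in.
my $headers = "PO Number, Invoice Number ,DC Number";

my @unsplit = ($headers);                     # one element that contains commas
my @split   = parse_header_list($headers);    # one element per header name

printf "unsplit: %d element(s)\n", scalar @unsplit;
printf "split:   %d element(s)\n", scalar @split;
print  "  [$_]\n" for @split;

# @split is the list that would then be passed as: headers => \@split
```

If the second count matches the number of columns you expect, passing \@split as the headers value should behave the same way as the hard-coded list in the question.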
Upvotes: 1