Reputation: 356
My Perl code using HTML::TableExtract doesn't work.
Here is my code:
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TableExtract;
## Extract table from HTML file
my $te = HTML::TableExtract->new( attribs => { border => 0 } );
$te->parse_file("file_path.html");
my $table = $te->tables;
for my $row ($table->rows) {
print join(',', @$row), "\n";
}
I keep getting this error:
Can't call method "rows" without a package or object reference at ./parse_table.pl line 13.
Here is my HTML file, truncated to show only the table I am interested in. http://phucnvo.myvnc.com/sandbox/out.html
<div>
<form name="listAssignmentsForm" action="https://t-square.gatech.edu/portal/tool/3a34f619-99d1-4548-be57-9ee977fd8127?panel=Main"
method="post">
<input type="hidden" name="source" value="0"/>
<table class="listHier lines nolines" border="0" cellspacing="0"
summary="List of assignments. Column headers are also links which can be used to sort the table by that column. Column 1: Indicates if the assignment has attachments. Column 2: assignment title and links to edit, duplicate or grade(if allowed). Column 3: status. Column 4: opening date. Column 5: due date. The rest of the columns may or may not be present. Column 6: may have the number submitted and graded. Column 7: may have checkboxes to select and remove the assignment.">
<tr>
<th id="attachments" class="attach"> </th>
<th id="title">
<a href="#" onclick="location='url'; return false;" title="Sort by title"> Assignment title </a>
</th>
<th id="For">
<a href="#" onclick="location='url'; return false;" title="Sort by audience">For</a>
</th>
<th id="status">
<a href="#"
onclick="location='https://t-square.gatech.edu/portal/tool/3a34f619-99d1-4548-be57-9ee977fd8127?criteria=assignment_status&panel=Main&sakai_action=doSort'; return false;"
title="Sort by status"> Status </a>
</th>
<th id="openDate">
<a href="#"
onclick="location='https://t-square.gatech.edu/portal/tool/3a34f619-99d1-4548-be57-9ee977fd8127?criteria=opendate&panel=Main&sakai_action=doSort'; return false;"
title="Sort by section"> Open </a>
</th>
<th id="dueDate">
<a href="#"
onclick="location='https://t-square.gatech.edu/portal/tool/3a34f619-99d1-4548-be57-9ee977fd8127?criteria=duedate&panel=Main&sakai_action=doSort'; return false;"
title="Sort by due date"> Due </a>
</th>
</tr>
<tr>
<td headers="attachments" class="attach">
<img id="attachment1" src="/library/image/sakai/attachments.gif?panel=Main" alt="Attachments" width="13" height="11" border="0"/>
</td>
<td headers="title">
<h4><a href="url">Project 7</a></h4>
</td>
<td style="padding-bottom:0"> site </td>
<td headers="status"> Submitted Jul 24, 2013 12:24 am </td>
<td headers="openDate"> Jul 19, 2013 12:00 pm </td>
<td headers="dueDate"> Jul 26, 2013 11:55 pm </td>
</tr>
<tr>
<td headers="attachments" class="attach">
<img id="attachment2" src="/library/image/sakai/attachments.gif?panel=Main" alt="Attachments" width="13" height="11" border="0"/>
</td>
<td headers="title">
<h4><a href="url">Project 6</a></h4>
</td>
<td style="padding-bottom:0"> site </td>
<td headers="status"> Submitted Jul 19, 2013 4:33 am </td>
<td headers="openDate"> Jul 11, 2013 12:00 pm </td>
<td headers="dueDate"> Jul 18, 2013 11:55 pm </td>
</tr>
<tr>
<td headers="attachments" class="attach">
<img id="attachment3" src="/library/image/sakai/attachments.gif?panel=Main" alt="Attachments" width="13" height="11" border="0"/>
</td>
<td headers="title">
<h4><a href="url">Project 5</a></h4>
</td>
<td style="padding-bottom:0"> site </td>
<td headers="status"> Submitted Jul 10, 2013 11:37 pm </td>
<td headers="openDate"> Jun 27, 2013 12:00 pm </td>
<td headers="dueDate"> Jul 10, 2013 11:55 pm </td>
</tr>
<tr>
<td headers="attachments" class="attach">
<img id="attachment4" src="/library/image/sakai/attachments.gif?panel=Main" alt="Attachments" width="13" height="11" border="0"/>
</td>
<td headers="title">
<h4><a href="url">Threads Practice </a></h4>
</td>
<td style="padding-bottom:0"> site </td>
<td headers="status"> Not Started </td>
<td headers="openDate"> Jun 27, 2013 12:00 pm </td>
<td headers="dueDate"> Jun 27, 2013 12:05 pm </td>
</tr>
<tr>
<td headers="attachments" class="attach">
<img id="attachment5" src="/library/image/sakai/attachments.gif?panel=Main" alt="Attachments" width="13" height="11" border="0"/>
</td>
<td headers="title">
<h4><a href="url">Project 4</a></h4>
</td>
<td style="padding-bottom:0"> site </td>
<td headers="status"> Submitted Jun 27, 2013 4:58 am </td>
<td headers="openDate"> Jun 20, 2013 1:00 am </td>
<td headers="dueDate"> Jun 26, 2013 11:55 pm </td>
</tr>
<tr>
<td headers="attachments" class="attach">
<img id="attachment6" src="/library/image/sakai/attachments.gif?panel=Main" alt="Attachments" width="13" height="11" border="0"/>
</td>
<td headers="title">
<h4><a href="url">Project 3</a></h4>
</td>
<td style="padding-bottom:0"> site </td>
<td headers="status"> Submitted Jun 20, 2013 3:19 am </td>
<td headers="openDate"> Jun 6, 2013 12:00 pm </td>
<td headers="dueDate"> Jun 19, 2013 11:55 pm </td>
</tr>
<tr>
<td headers="attachments" class="attach">
<img id="attachment7" src="/library/image/sakai/attachments.gif?panel=Main" alt="Attachments" width="13" height="11" border="0"/>
</td>
<td headers="title">
<h4><a href="url">Project 2</a></h4>
</td>
<td style="padding-bottom:0"> site </td>
<td headers="status"> Submitted Jun 5, 2013 5:39 am </td>
<td headers="openDate"> May 28, 2013 12:00 pm </td>
<td headers="dueDate"> Jun 4, 2013 11:55 pm </td>
</tr>
<tr>
<td headers="attachments" class="attach">
<img id="attachment8" src="/library/image/sakai/attachments.gif?panel=Main" alt="Attachments" width="13" height="11" border="0"/>
</td>
<td headers="title">
<h4><a href="url">Project 1: Processor Design</a></h4>
</td>
<td style="padding-bottom:0"> site </td>
<td headers="status"> Submitted May 31, 2013 2:09 am </td>
<td headers="openDate"> May 16, 2013 1:40 pm </td>
<td headers="dueDate"> May 30, 2013 11:55 pm </td>
</tr>
</table>
</form>
</div>
What I expect to see for each row are the assignment title, status, open date, and due date.
Upvotes: 1
Views: 1134
Reputation: 62089
As ysth suggested, your problem is right here:
my $table = $te->tables;
The method name tables is plural, suggesting it should be called in list context. You're calling it in scalar context. In Perl, many functions that return a list will return the length of that list if called in scalar context. tables is one of them, so $table gets set to 1. You can't call methods on a number (well, not without autobox).
Try this:
my ($table) = $te->tables;
The parentheses around $table on the left-hand side make it a list assignment. $table gets the first table found, and any additional tables are discarded.
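The difference between the two assignments can be seen without HTML::TableExtract at all. The following is a minimal sketch in core Perl; found_items is a hypothetical stand-in for a method that returns an array, not part of any library.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for a method like tables(): it returns an array.
sub found_items {
    my @items = ('first', 'second', 'third');
    return @items;
}

my $count   = found_items();    # scalar context: the number of elements
my ($first) = found_items();    # list context: first element, rest discarded
my @all     = found_items();    # list context: every element

print "scalar assignment: $count\n";    # prints "scalar assignment: 3"
print "list assignment:   $first\n";    # prints "list assignment:   first"
print "array assignment:  @all\n";      # prints "array assignment:  first second third"
```

Only the parentheses differ between the first two assignments, yet one yields a count and the other yields an element you can actually use.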
Upvotes: 3
Reputation: 98388
The doc says:
tables()
Return table objects for all tables that matched. Returns an empty list if no tables matched.
It is expecting to be called like:
my @tables = $te->tables();
and apparently it isn't finding any, so is returning nothing.
Perhaps you could provide a trimmed-down version of your HTML that still demonstrates the problem, and say what you expect to happen?
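As a rough sketch of what such a trimmed-down test could look like: the script below parses a small inline sample (hypothetical data modelled on the question's table, not the real page), calls tables in list context, and bails out if nothing matched. It selects columns with the headers option instead of attribs, on the assumption that the header text of the real page matches these strings; the pipe separator is also just an illustrative choice, since the dates themselves contain commas.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TableExtract;

# Trimmed-down stand-in for the real page (hypothetical sample data).
my $html = <<'HTML';
<table border="0">
<tr><th> </th><th><a href="#"> Assignment title </a></th><th>For</th>
    <th> Status </th><th> Open </th><th> Due </th></tr>
<tr><td></td><td><h4><a href="url">Project 7</a></h4></td><td> site </td>
    <td> Submitted Jul 24, 2013 12:24 am </td>
    <td> Jul 19, 2013 12:00 pm </td><td> Jul 26, 2013 11:55 pm </td></tr>
</table>
HTML

# Select the table by header text; only the matching columns are returned.
my $te = HTML::TableExtract->new(
    headers => [ 'Assignment title', 'Status', 'Open', 'Due' ],
);
$te->parse($html);
$te->eof;

# List context: tables() returns zero or more table objects.
my @tables = $te->tables;
die "No matching table found\n" unless @tables;

my @lines;
for my $row ($tables[0]->rows) {
    # Cells can be undef or padded with whitespace, so normalise them.
    my @cells = map { my $c = $_ // ''; $c =~ s/^\s+|\s+$//g; $c } @$row;
    push @lines, join(' | ', @cells);
}
print "$_\n" for @lines;
```

If the die fires against the real file, the constructor's selection criteria are the next thing to check, since an empty list from tables means no table matched them.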
Upvotes: 2