Michael

Reputation: 1

Build array from parsed HTML table

I've just started teaching myself PHP and need a little help. Using DOM, I'm trying to parse an HTML table and put the results into a MySQL database, but I'm having problems. I can echo each row of the table with this:

foreach ($html->find('table') as $table) {
    foreach ($table->find("tr") as $rows) {
        echo $rows."<br />";
    }
}

The data has this structure:

<tr>
<a href=http://some link>text1</a>
<td class="...">text2</td>
<td class="...">text3</td>
<td class="...">text4</td>
</tr>

I'm trying to get the link and text1-4 into an array, but can't figure it out. Any help would be appreciated.

--Edit--
This is the table layout that I'm trying to break down.

<tr>
<th class="school first">
    <a href="/local/team/home.aspx?schoolid=472d8593">Aberdeen</a>
</th>
<td class="mascot">Bulldogs</td>
<td class="city">Aberdeen</td>
<td class="state last">MS</td>
</tr>

jnpcl's answer gives me this

  ROW
  TEXT: Bulldogs
  TEXT: Aberdeen
  TEXT: MS

but no link. I probably wasn't specific enough in my original question, but, like I said, I'm trying to learn, and I usually do it by jumping in the deep end of the pool.

Upvotes: 0

Views: 2492

Answers (1)

jrn.ak

Reputation: 36619

Update: Should now work with OP's updated sample code.

This should get you going:

table-array.php

<?php
    // SimpleHTMLDom Library
    require_once('lib/simple_html_dom.php');

    // Source Data
    $source = 'table-array-data.htm';

    // Displays Extra Debug Info
    $dbg = 1;

    // Read DOM
    $html = file_get_html($source);

    // Confirm DOM
    if ($html) {
        // Debug Output
        if ($dbg) { echo '<pre>'; }

        // Loop for each <table>
        foreach ($html->find('table') as $table) {

            // Debug Output
            if ($dbg) { echo 'TABLE' . PHP_EOL; }

            // Loop for each <tr>
            foreach ($table->find('tr') as $row) {

                // Debug Output
                if ($dbg) { echo '  ROW' . PHP_EOL; }

                // Loop for each <th>
                foreach ($row->find('th') as $cell) {

                    // Look for <a> tag
                    $link = $cell->find('a');

                    // Found a link
                    if (count($link) == 1) {

                        // Debug Output
                        if ($dbg) { echo '      LINK: ' . $link[0]->innertext . ' (' . $link[0]->href . ')' . PHP_EOL; }
                    }

                }

                // Loop for each <td>
                foreach ($row->find('td') as $cell) {

                    // Debug Output
                    if ($dbg) { echo '      CELL: ' . $cell->innertext . PHP_EOL; }
                }
            }
        }
        // Debug Output
        if ($dbg) { echo '</pre>'; }
    }
?>

table-array-data.htm

<table>
    <tr class="first">
        <th class="school first"><a href="/local/team/home.aspx?schoolid=472d8593-9099-4925-81e0-ae97cae44e43&amp">Aberdeen</a></th>
        <td class="mascot">Bulldogs</td>
        <td class="city">Aberdeen</td>
        <td class="state last">MS</td>
    </tr>
</table>

output

TABLE
  ROW
      LINK: Aberdeen (/local/team/home.aspx?schoolid=472d8593-9099-4925-81e0-ae97cae44e43&)
      CELL: Bulldogs
      CELL: Aberdeen
      CELL: MS
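
The script above only echoes what it finds. Since the question asks for the values in an array, here is a minimal sketch of that step. Note that it swaps in PHP's built-in DOMDocument and DOMXPath instead of SimpleHTMLDom so that it runs without any extra library. The function name tableToRows, the array keys 'link' and 'school', and the idea of using the first class name of each <td> as its key are my own assumptions, not something from the question. The sample markup uses the shortened href from the question's edit.

```php
<?php
// Sample markup copied from the table layout in the question's edit.
$markup = <<<HTML
<table>
    <tr>
        <th class="school first"><a href="/local/team/home.aspx?schoolid=472d8593">Aberdeen</a></th>
        <td class="mascot">Bulldogs</td>
        <td class="city">Aberdeen</td>
        <td class="state last">MS</td>
    </tr>
</table>
HTML;

// Hypothetical helper: returns one associative array per <tr> that has a link in its <th>.
function tableToRows(string $markup): array
{
    $doc = new DOMDocument();

    // Collect parser warnings internally instead of printing them;
    // real-world HTML is rarely valid enough for libxml's taste.
    libxml_use_internal_errors(true);
    $doc->loadHTML($markup);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $rows  = [];

    foreach ($xpath->query('//table//tr') as $tr) {
        // First <a> inside a <th> of this row, or null if there is none.
        $link = $xpath->query('./th/a', $tr)->item(0);
        if ($link === null) {
            continue; // skip rows without a link, e.g. a header row
        }

        $row = [
            'link'   => $link->getAttribute('href'),
            'school' => trim($link->textContent),
        ];

        // Assumption: the first class name of each <td> (mascot, city, state)
        // is a usable array key.
        foreach ($xpath->query('./td', $tr) as $td) {
            $key       = explode(' ', trim($td->getAttribute('class')))[0];
            $row[$key] = trim($td->textContent);
        }

        $rows[] = $row;
    }

    return $rows;
}

$rows = tableToRows($markup);
print_r($rows);
```

With the sample row, $rows[0] should hold the keys link, school, mascot, city and state. Each element is then a flat associative array, which is a convenient shape to hand to a prepared INSERT statement later on. If you would rather stay with SimpleHTMLDom, the same array can be filled inside the existing loops, using $link[0]->href and $cell->innertext where the debug echo lines are now.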

Upvotes: 2
