Sri Ram

Reputation: 1

How to get a specific <td> when scraping a web page table with a DOM parser?

I have a table whose number of columns can change depending on the configuration of the scraped page (I have no control over it). I want to get only the information from a specific column, identified by that column's heading.

Sample table:

<table>
    <tr>
        <td>Name</td>
        <td>Age</td>
        <td>Marks</td>
    </tr>
    <tr>
        <td>A</td>
        <td>20</td>
        <td>90</td>
    </tr>
    <tr>
        <td>B</td>
        <td>21</td>
        <td>80</td>
    </tr>
    <tr>
        <td>C</td>
        <td>22</td>
        <td>70</td>
    </tr>
</table>

My working PHP code to display all columns:

foreach($html->find("table#table2 tr td") as $td) {
  $code = $td;
  echo $code;
}

Needed code format:

foreach($html->find('table#table2 td') as $td) {
  /* Get td1 data */ 
  /* Code1 to store td data 1 */

  /* Get next td data */ 
  /* Code2 to store td data 2 */

  /* Get the next td data */ 
  /* Code3 to store td data 3 */
}

I want to extract the output and store it in the appropriate columns of a DB table named result.

I can write the storing code myself. What I need is code that retrieves the consecutive td values inside a row without an inner loop, since the code to store each td value differs.

Post I referred to: scraping webpage.

Upvotes: 0

Views: 5200

Answers (2)

Patt Mehta

Reputation: 4194

// Create DOM from URL or file
$html = file_get_html("http://www.example.org/");

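// Find all rows of the table
$tr_array = $html->find("table#table2 tr");

// Collect the cells row by row: each entry of $td_array
// is the array of td elements found in one tr
$td_array = [];
foreach($tr_array as $tr) {
    array_push($td_array, $tr->find("td"));
}

// Print the table again, one tr per stored array of cells
echo "<table id=\"table1\">";
foreach($td_array as $tds) {
    echo "<tr>";
    // $tds is an array, so each td has to be echoed individually
    foreach($tds as $td) {
        echo $td;
    }
    echo "</tr>";
}
echo "</table>";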

For advanced topics, see the simplehtmldom documentation.


In the code above, I store arrays inside an array. The same pattern in isolation:

<?php

$a = [];
$a1 = [1,2,3];
$a2 = [4,5,6];
array_push($a,$a1,$a2);
foreach($a as $a_e) {
  foreach($a_e as $e) {
    echo $e;
  }
  echo "<br>";
}

?>

Outputs:

123
456
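
To pick out only one column by its heading, as the question asks, something along these lines may work. It is a minimal sketch, assuming the Simple HTML DOM library is available as simple_html_dom.php, that the first tr holds the headings, and that the sample table carries the id table2 used in the question's code. The names $wanted, $index and $values are only illustrative.

```php
<?php
// Assumption: the Simple HTML DOM library sits next to this script.
include 'simple_html_dom.php';

// Sample table from the question, with the id used in the question's code.
$html = str_get_html('<table id="table2">
    <tr><td>Name</td><td>Age</td><td>Marks</td></tr>
    <tr><td>A</td><td>20</td><td>90</td></tr>
    <tr><td>B</td><td>21</td><td>80</td></tr>
    <tr><td>C</td><td>22</td><td>70</td></tr>
</table>');

$wanted = 'Marks'; // heading of the column to extract (illustrative)
$rows   = $html->find('table#table2 tr');

// Look up the position of the wanted heading in the first row.
$index = null;
$headerCells = $rows ? $rows[0]->find('td') : [];
foreach ($headerCells as $i => $cell) {
    if (trim($cell->plaintext) === $wanted) {
        $index = $i;
        break;
    }
}

// Read the cell at that position from every remaining row.
$values = [];
if ($index !== null) {
    foreach (array_slice($rows, 1) as $row) {
        $cell = $row->find('td', $index); // null if the row is shorter
        if ($cell !== null) {
            $values[] = trim($cell->plaintext);
        }
    }
}

print_r($values);
```

Because the index is looked up from the heading on every run, the code does not depend on how many columns the page shows. For the sample table, $values should end up holding 90, 80 and 70.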

Upvotes: 1

M Shahzad Khan

Reputation: 935

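Getting all the td's in a specific table:

// Select one table by its zero-based index (0, 1, 2, 3, ...) among all tables in the returned HTML
$tableNumber = 0; // set this to the index of the table you want
$table = $html->find('table', $tableNumber);
// Search inside that table only, not the whole document
$td = $table->find('td');
foreach($td as $tds)
{
  echo $tds;
}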
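
The optional index argument of find() used above to select a table can also be used to select single cells, which may cover the "consecutive td values without an inner loop" part of the question. This is only a sketch: it assumes simple_html_dom.php is available, that the first row is the heading row, and that the columns are in the order Name, Age, Marks. The $records array and its keys are hypothetical stand-ins for the asker's own storing code.

```php
<?php
// Assumption: the Simple HTML DOM library sits next to this script.
include 'simple_html_dom.php';

// Sample table from the question, with the id used in the question's code.
$html = str_get_html('<table id="table2">
    <tr><td>Name</td><td>Age</td><td>Marks</td></tr>
    <tr><td>A</td><td>20</td><td>90</td></tr>
    <tr><td>B</td><td>21</td><td>80</td></tr>
    <tr><td>C</td><td>22</td><td>70</td></tr>
</table>');

$records = []; // hypothetical container; replace with the DB storing code

foreach ($html->find('table#table2 tr') as $n => $tr) {
    if ($n === 0) {
        continue; // skip the heading row
    }

    // Fetch each cell directly by position; no loop over the td elements.
    $name  = $tr->find('td', 0);
    $age   = $tr->find('td', 1);
    $marks = $tr->find('td', 2);

    // find() with an index returns null when the cell does not exist.
    if ($name === null || $age === null || $marks === null) {
        continue;
    }

    $records[] = [
        'name'  => trim($name->plaintext),
        'age'   => trim($age->plaintext),
        'marks' => trim($marks->plaintext),
    ];
}

print_r($records);
```

Each td gets its own variable, so a different piece of storing code can follow each lookup. Fixed positions only hold while the column order stays the same, so on a page whose columns change they would need to be derived from the heading row first.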

Upvotes: 0
