Parse HTML Table with DOM and XPath

Question

I'm trying to parse an HTML Table with XPath. The URL is: click here.

I used Firebug to inspect the page's DOM and have identified the container I need.

19/4 18:30
CHELSEA FC - SUNDERLAND
1,21  92,8%
8,00  4,7%
18,00  2,5%
353.660 €
1,56  67,5%
2,74  32,5%
6.243 €

This is only one row; there are hundreds more. Every row holds the same kinds of information, so I can walk through each one and check whether a cell contains the date, the match, the money, etc. I need a condition for each kind of cell so that I can store all of them in an array.

I am following this tutorial: click here

Which condition can I use to distinguish each cell from the others?

I want to have something like this for each row in the table:

[0] => Array
            (
                [date] => 18:30 19/4
                [teams] => CHELSEA FC - SUNDERLAND
                [1] => 1,21
                [1 volumes] => 92,8%
                [X] => 8,00
                [X volumes] => 4,7%
                [2] => 18,00
                [2 volumes] => 2,5%
                [matched] => 353.660 € 
                  ...

            )

This is the PHP; I'm stuck at this point:

$dom = new DOMDocument();
// $html is assumed to already hold the page's HTML source
$dom->loadHTML($html);

$xpath = new DOMXPath($dom);

$scores = array();

$tableRows = $xpath->query('//div//div//div[2]//div/div//table//tr');

foreach ($tableRows as $row) {
    // fetch all 'td' cells inside this 'tr'
    $td = $xpath->query('td', $row);
    $match = array();
    // ... stuck here
}

Jens Erat · Accepted Answer

Your query already fetches all table rows. In the next step, loop over these results (in PHP) and read the cells of each row as needed. You can use either direct DOM access or XPath, whichever you prefer.
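As a rough illustration of the direct DOM access option, here is a minimal sketch. The one-row table in `$html` is a hypothetical stand-in for the real page, and the variable names `$rows` and `$cells` are made up for this example. It simply collects the text of every cell per row, without XPath.

```php
<?php
// Hypothetical one-row table standing in for the real page.
$html = '<table><tr><td>19/4 18:30</td><td>CHELSEA FC - SUNDERLAND</td></tr></table>';

$dom = new DOMDocument();
$dom->loadHTML($html);

$rows = array();
foreach ($dom->getElementsByTagName('tr') as $row) {
    $cells = array();
    // On an element, getElementsByTagName() returns all descendant
    // <td> nodes in document order.
    foreach ($row->getElementsByTagName('td') as $cell) {
        $cells[] = trim($cell->textContent);
    }
    $rows[] = $cells;
}

print_r($rows);
```

Because `getElementsByTagName()` matches descendants at any depth, a page with nested tables would probably need XPath or a check on each cell's parent node instead.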

To use XPath, write an expression that is evaluated relative to the current context, and pass the current row as the context node. Use numerical predicates to select the cell you're looking for. For example, to query the team name (in the third table cell; XPath positions are 1-indexed), use something like:

$tableRows = $xpath->query('//div//div//div[2]//div/div//table//tr');
foreach ($tableRows as $row) {
    $team = $xpath->query('./td[3]/a', $row)->item(0)->textContent;
}

Querying the class attributes might also be possible, but they seem to be used rather arbitrarily.

Now read the other table cells with similar queries, build the resulting map, and append it to the $scores array.
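Putting these steps together might look like the sketch below. The sample markup and the `$columns` lookup table are assumptions made for illustration. Only the team name in `td[3]/a` comes from the answer above, and the other cell positions would have to be checked against the real page.

```php
<?php
// Hypothetical sample markup; the real page's cell positions may differ.
$html = <<<HTML
<table>
  <tr>
    <td>19/4 18:30</td>
    <td></td>
    <td><a href="#">CHELSEA FC - SUNDERLAND</a></td>
    <td>1,21</td>
    <td>92,8%</td>
  </tr>
</table>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Assumed mapping from result key to an XPath relative to the row.
$columns = array(
    'date'      => './td[1]',
    'teams'     => './td[3]/a',
    '1'         => './td[4]',
    '1 volumes' => './td[5]',
);

$scores = array();
foreach ($xpath->query('//table//tr') as $row) {
    $match = array();
    foreach ($columns as $key => $expr) {
        // Evaluate the expression with the current row as context node.
        $node = $xpath->query($expr, $row)->item(0);
        // A missing cell yields null instead of a fatal error.
        $match[$key] = $node ? trim($node->textContent) : null;
    }
    $scores[] = $match;
}

print_r($scores);
```

Keeping the expressions in one array means a changed column position needs only a one-line fix, and the null check guards against header or separator rows that lack some of the cells.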
